A common problem in modeling the response surface in most systems,
and in particular in a mixture system, is that of detecting lack of
fit, or inadequacy, of a fitted model of the form $E(Y) = X\beta_1$,
in comparison to a model of the form $E(Y) = X\beta_1 + X_2\beta_2$
postulated as the true model. One method for detecting lack of fit
involves comparing the value of the response observed at certain
locations in the factor space, called "check points," with the value
of the response that the fitted model predicts at these same check
points. The observations at the check points are used only for
testing lack of fit and are not used in fitting the model.

It is shown that under the usual assumptions of independent and
normally distributed errors, the lack of fit test statistic which
uses the data at the check points is an F statistic. When no lack of
fit is present the statistic possesses a central F distribution, but
in general, in the presence of lack of fit, the statistic possesses a
doubly noncentral F distribution. The power of this F test depends
on the location of the check points in the factor space through its
noncentrality parameters. A method of selecting check points that
maximize the power of the test for lack of fit through their
influence on the numerator noncentrality parameter is developed.

A  second  method  for  detecting  lack  of  fit  relies  on 
replicated  response  observations.   The  residual  sum  of 
squares  from  the  fitted  model  is  partitioned  into  a  pure 
error  variation  component  and  into  a  lack  of  fit  variation 
component.   Lack  of  fit  is  detected  if  the  lack  of  fit 
variation  is  large  in  comparison  to  the  pure  error 
variation.   This  method  can  be  generalized  when  "near 
neighbor"  observations  must  be  substituted  for  replicates. 
In  this  case,  the  test  statistic  (assuming  independent  and 
normally  distributed  errors)  has  a  central  F  distribution 
when  the  fitted  model  is  adequate  and  a  doubly  noncentral  F 
distribution  under  lack  of  fit.   The  arrangement  of  near 
neighbors  is  seen  to  affect  the  testing  procedure  and  its 
power.

CHAPTER ONE
INTRODUCTION

1.1   The  Response  Surface  Problem 
A  mixture  problem  is  a  special  type  of  a  response 
surface  problem.   First  we  shall  define  the  general  response 
surface  problem  and  indicate  the  basic  objectives  sought  in 
its  analysis,  and  follow  this  development  with  a  discussion 
of  the  mixture  problem. 

In the general response surface problem, we are interested in
studying the relationship between an observable response, $Y$, and a
set of $q$ independent variables or factors, $x_1, x_2, \ldots, x_q$,
whose levels are assumed controlled by the experimenter. The
independent variables are quantitative and continuous. We express
this relationship in terms of a continuous response function, $\phi$,
as

$$Y_u = \phi(x_{u1}, x_{u2}, \ldots, x_{uq}) + \epsilon_u$$

where $Y_u$ is the $u$th of $N$ observations of the response
collected in an experiment, and $x_{ui}$ represents the $u$th level of
the $i$th independent variable, $u = 1, 2, \ldots, N$,
$i = 1, 2, \ldots, q$. The exact functional relationship, $\phi$, is
unknown. The term $\epsilon_u$ is the experimental error of the $u$th
observation. It is assumed that $E(\epsilon_u) = 0$,
$E(\epsilon_u \epsilon_{u'}) = 0$ for $u \ne u'$, and
$E(\epsilon_u^2) = \sigma^2$ for $u = 1, 2, \ldots, N$.

As the form of $\phi$ is unknown and may be quite complex, a low
order polynomial (usually first or second order) in the independent
variables $x_1, x_2, \ldots, x_q$ is generally used to approximate
$\phi$. This may be justified by noting that such polynomials
constitute low order terms of a Taylor series expansion of $\phi$
about the point $x_1 = x_2 = \cdots = x_q = 0$ (Myers, 1971, p. 62).
Cochran and Cox (1957, p. 336) point out that these low order
polynomials may give a poor approximation to $\phi$ when extrapolated
beyond the experimental region, and thus should not be used for this
purpose.

A linear response surface model may be written in matrix notation as

$$Y = X\beta + \epsilon \tag{1.1}$$

where $Y$ is an $N \times 1$ vector of observable response values,
$X$ is an $N \times p$ matrix of known constants, $\beta$ is a
$p \times 1$ vector of unknown parameters (regression coefficients),
and $\epsilon$ is the $N \times 1$ vector of random errors. When the
model is a first or a second degree polynomial, the columns of $X$
correspond to the first or second degree powers of the independent
variables $x_1, x_2, \ldots, x_q$, or their cross products. If the
model contains a constant term, $\beta_0$, the first column of $X$
will correspond to this term, and will consist of $N$ ones. Since
$E(\epsilon) = 0$, an alternative representation for the response
surface model of (1.1) is

$$E(Y) = X\beta .$$
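
The column layout of $X$ described above can be illustrated with a short sketch. It builds the rows of $X$ for a full second degree polynomial with a constant term; the function names and the four run settings are hypothetical and are chosen only for illustration.

```python
from itertools import combinations

def second_order_row(x, intercept=True):
    """Return one row of X for a full second degree polynomial in x."""
    values = [float(v) for v in x]
    row = [1.0] if intercept else []                       # constant term
    row += values                                          # first degree terms
    row += [v ** 2 for v in values]                        # pure quadratic terms
    row += [a * b for a, b in combinations(values, 2)]     # cross products
    return row

def design_matrix(settings, intercept=True):
    """Stack one row per experimental run, giving N rows and p columns."""
    return [second_order_row(x, intercept) for x in settings]

# Hypothetical settings of q = 2 factors at N = 4 runs (not from the text).
settings = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
X = design_matrix(settings)
for row in X:
    print(row)
```

With $q = 2$ this gives $p = 6$ columns: the constant, $x_1$, $x_2$, $x_1^2$, $x_2^2$, and $x_1 x_2$.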

Once the form of the model that will be used to approximate
$\phi(x_1, x_2, \ldots, x_q)$ is chosen, the next step is to estimate
the regression coefficients, $\beta$, and then use the estimated
model to make inferences about the true response function, $\phi$.
The estimation of the elements of $\beta$ is usually accomplished by
ordinary least squares techniques. For the purpose of testing
hypotheses concerning the regression coefficients, $\beta$, it is
assumed that $\epsilon$ has a normal distribution, that is,
$\epsilon \sim N(0, \sigma^2 I_N)$.

Perhaps the most common objective in the exploration of a response
system is the determination of its optimum operating conditions. By
this we mean that it is desired to find the settings of
$x_1, x_2, \ldots, x_q$ that optimize $\phi$, which in some
applications may be interpreted as maximizing $\phi$, while in other
applications a minimum value of $\phi$ may be of interest. It is also
often desirable to determine the behavior of the response function in
the neighborhood of the optimum. For second order response models,
such an investigation can be carried out by performing a canonical
analysis of the second order surface as discussed in Myers (1971).

For simple systems having only one or two independent variables, the
response surface may be explored by just plotting the fitted response
values against values taken by the independent variables. If $q = 1$,
implying only one independent variable, say $x_1$, then a
two-dimensional plot of the fitted response against $x_1$ may be used
to locate the optimum, as well as to investigate the response
behavior in other parts of the experimental range of $x_1$. If
$q = 2$, and the two independent variables are $x_1$ and $x_2$, then a
plot of the contours of constant response over the region specified
by the ranges of the values for $x_1$ and $x_2$ can be used to
describe the response surface.

The properties that the fitted model possesses in terms of its
ability to represent the true surface, $\phi$, depend on the settings
of $x_1, x_2, \ldots, x_q$ at which values of $Y$ are observed. Thus
the experimental design is of great importance. Much work has been
done on the construction of designs that are optimal with respect to
one criterion or another involving the fitted response and/or the
true unfitted model. Box and Draper (1975) list fourteen criteria to
consider  when  choosing  a  design  for  investigating  response 
surfaces.   Myers  (1971)  gives  several  designs  for  fitting 
first  and  second  order  polynomial  models.   A  discussion  of 
specific  design  considerations  will  not  be  attempted  here, 
as  such  a  discussion  is  not  the  focus  of  this  dissertation, 
and  would  necessarily  be  lengthy. 

The initial steps in the analysis of a response system may be
described as follows: First an attempt is made to approximate the
true response function, $\phi(x_1, x_2, \ldots, x_q)$, usually with a
low order polynomial in $x_1, x_2, \ldots, x_q$. After the form of
the model has been chosen, an appropriate experimental design is
selected, which specifies the settings of the independent variables
at which observed values of the response will be collected. The
observed values of the response are used in estimating the regression
coefficients in the model, using, in general, ordinary least squares.
After a test for "goodness of fit" of the model verifies that the
fitted model is adequate, the fitted model is used in determining
optimum operating conditions for the response system.

1.2  The  Mixture  Problem 
A  mixture  system  is  a  response  system  in  which  the 
response  depends  only  on  the  relative  proportions  of  the 
components  or  ingredients  present  in  a  mixture,  and  not  on 
the  total  amount  of  the  mixture.   For  example,  the  response 
might  be  the  octane  rating  of  a  blend  of  gasolines  where  the 
rating  is  a  function  only  of  the  relative  percentages  of  the 
gasoline types present in the blend. The proportion of each
ingredient in the mixture, denoted by $x_i$, must lie between zero
and unity, $i = 1, 2, \ldots, q$. The sum of the proportions of all
the components will equal unity, that is,

$$0 \le x_i \le 1, \quad i = 1, 2, \ldots, q, \qquad \sum_{i=1}^{q} x_i = 1. \tag{1.2}$$

The factor space containing the $q$ components is represented by a
$(q - 1)$-dimensional simplex. For $q = 2$ components, the factor
space is a straight line, whereas for $q = 3$ components, the factor
space is an equilateral triangle, and for $q = 4$ components, the
factor space is represented by a regular tetrahedron.

The objectives in the analysis of a mixture response system are, in
general, the same as in any response surface exploration. That is,
one seeks to approximate the surface with a model equation by fitting
an equation to observations taken at preselected combinations of the
mixture components. Another objective is to determine the roles
played by the individual components. We shall not concern ourselves
with this but rather concentrate on the empirical model fit. Once the
model equation is deemed adequate an attempt is made to determine
which of the component combinations yield the optimal response. The
models used to represent the response in a mixture system are in most
cases different in form from the standard polynomial models. The
first type of model form that we discuss is the canonical polynomial
suggested by Scheffe.

1.2.1   Mixture Models

Scheffe (1958) introduced a canonical form of the polynomial model
for representing the response in a mixture system. These canonical
polynomial models are derived from the standard polynomials using the
restrictions on the $x_i$ shown in (1.2). With $q = 2$ mixture
components, for example, the standard second order polynomial model
is of the form

$$E(Y) = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_{12} x_1 x_2 + \alpha_{11} x_1^2 + \alpha_{22} x_2^2 . \tag{1.3}$$

Restrictions (1.2) imply that $\alpha_0 = \alpha_0(x_1 + x_2)$,
$x_1^2 = x_1(1 - x_2)$, and $x_2^2 = x_2(1 - x_1)$; thus (1.3) can be
written in the canonical form

$$E(Y) = \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 ,$$

where $\beta_1 = \alpha_0 + \alpha_1 + \alpha_{11}$,
$\beta_2 = \alpha_0 + \alpha_2 + \alpha_{22}$, and
$\beta_{12} = \alpha_{12} - \alpha_{11} - \alpha_{22}$. There is no
constant term in the above canonical form and the pure quadratic
terms in equation (1.3) have been absorbed in the $x_i x_j$ terms.
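
As a numerical check on this reparametrization, the following sketch maps the six coefficients $\alpha_0, \alpha_1, \alpha_2, \alpha_{12}, \alpha_{11}, \alpha_{22}$ of the standard second order polynomial in $q = 2$ components to the three canonical coefficients and confirms that the two forms agree at points satisfying $x_1 + x_2 = 1$. The function names and the coefficient values are arbitrary and purely illustrative.

```python
def canonical_from_standard(a0, a1, a2, a12, a11, a22):
    """Map standard second order coefficients to canonical ones (q = 2)."""
    b1 = a0 + a1 + a11
    b2 = a0 + a2 + a22
    b12 = a12 - a11 - a22
    return b1, b2, b12

def standard_model(x1, x2, a0, a1, a2, a12, a11, a22):
    """Standard second order polynomial in two variables."""
    return a0 + a1 * x1 + a2 * x2 + a12 * x1 * x2 + a11 * x1 ** 2 + a22 * x2 ** 2

def canonical_model(x1, x2, b1, b2, b12):
    """Canonical second order polynomial: no constant, no pure quadratics."""
    return b1 * x1 + b2 * x2 + b12 * x1 * x2

# Arbitrary illustrative coefficients (not from the text).
alphas = (2.0, 1.5, -0.5, 3.0, 0.7, -1.2)
betas = canonical_from_standard(*alphas)

# Compare the two forms on a grid of mixtures with x1 + x2 = 1.
max_gap = max(
    abs(standard_model(t / 10, 1 - t / 10, *alphas)
        - canonical_model(t / 10, 1 - t / 10, *betas))
    for t in range(11)
)
print(betas, max_gap)
```

The two forms differ away from the simplex, so the agreement holds only where the mixture restriction is satisfied.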

The general form of the canonical polynomial of degree $d$ in $q$
mixture components can be written as

$$E(Y) = \sum_{i=1}^{q} \beta_i x_i , \quad \text{for } d = 1,$$

$$E(Y) = \sum_{i=1}^{q} \beta_i x_i + \sum\sum_{1 \le i < j \le q} \beta_{ij} x_i x_j , \quad \text{for } d = 2, \text{ and}$$

$$E(Y) = \sum_{i=1}^{q} \beta_i x_i + \sum\sum_{1 \le i < j \le q} \beta_{ij} x_i x_j + \sum\sum_{1 \le i < j \le q} \delta_{ij} x_i x_j (x_i - x_j) + \sum\sum\sum_{1 \le i < j < k \le q} \beta_{ijk} x_i x_j x_k , \quad \text{for } d = 3. \tag{1.4}$$



The fourth degree canonical polynomial in $q$ components is given in
Cornell (1981, p. 64). The general canonical polynomial of degree
$d > 4$ in $q$ components does not explicitly appear in the
literature, but is mentioned in Scheffe (1958). If terms of the form
$\delta_{ij} x_i x_j (x_i - x_j)$ are removed from the full cubic
model (1.4), then the remaining terms make up what is referred to as
the special cubic model. For example, for $q = 3$ components, the
special cubic model is

$$E(Y) = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{12} x_1 x_2 + \beta_{13} x_1 x_3 + \beta_{23} x_2 x_3 + \beta_{123} x_1 x_2 x_3 .$$

Scheffe's  canonical  polynomial  models  are  used  for 
approximating  the  response  surface  in  many  mixture  systems. 
Their  popularity  stems  from  the  ease  in  interpreting  the 
coefficient  estimates,  especially  when  the  models  are  fitted 
to  data  collected  at  the  points  of  the  associated  designs 
(see Section 1.2.2). However, other models have been introduced
which seem to better represent the response when the
components  have  strictly  additive  blending  effects.   We 
present  some  of  them  now. 

Becker (1968) introduced three forms of homogeneous models of degree
one which he recommends instead of the polynomial models when one or
more of the mixture components have an additive effect or when one or
more components are inert. A function $f(x, y, \ldots, z)$ is said to
be homogeneous of degree $n$ when
$f(tx, ty, \ldots, tz) = t^n f(x, y, \ldots, z)$, for every positive
value of $t$ and $(x, y, \ldots, z) \ne (0, 0, \ldots, 0)$. These
models, which Becker refers to as models H1, H2, and H3, are of the
form

$$\text{H1:} \quad E(Y) = \sum_{i=1}^{q} \beta_i x_i + \sum\sum_{1 \le i < j \le q} \beta_{ij} \min(x_i, x_j) + \cdots + \beta_{12 \ldots q} \min(x_1, x_2, \ldots, x_q),$$

$$\text{H2:} \quad E(Y) = \sum_{i=1}^{q} \beta_i x_i + \sum\sum_{1 \le i < j \le q} \beta_{ij} x_i x_j / (x_i + x_j) + \cdots + \beta_{12 \ldots q} x_1 x_2 \cdots x_q / (x_1 + x_2 + \cdots + x_q)^{q-1},$$

$$\text{H3:} \quad E(Y) = \sum_{i=1}^{q} \beta_i x_i + \sum\sum_{1 \le i < j \le q} \beta_{ij} (x_i x_j)^{1/2} + \cdots + \beta_{12 \ldots q} (x_1 x_2 \cdots x_q)^{1/q}.$$

Each term in the H2 model is defined to be zero when the denominator
of the term is zero.
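
To make the homogeneity property concrete, the sketch below evaluates the two-component blending terms of models H1, H2, and H3, applies the convention that an H2 term is zero when its denominator is zero, and checks numerically that each term is homogeneous of degree one. The function names are hypothetical and the check uses a single arbitrary scale factor, so it illustrates the definition rather than proving it.

```python
import math

def h1_pair(xi, xj):
    """Two-component term of model H1: min(xi, xj)."""
    return min(xi, xj)

def h2_pair(xi, xj):
    """Two-component term of model H2, defined as zero when xi + xj = 0."""
    denom = xi + xj
    return 0.0 if denom == 0 else xi * xj / denom

def h3_pair(xi, xj):
    """Two-component term of model H3: square root of the product."""
    return math.sqrt(xi * xj)

def is_homogeneous_degree_one(term, xi, xj, t, tol=1e-12):
    """Check f(t*xi, t*xj) == t * f(xi, xj) at one point for one t > 0."""
    return abs(term(t * xi, t * xj) - t * term(xi, xj)) <= tol

for term in (h1_pair, h2_pair, h3_pair):
    print(term.__name__, is_homogeneous_degree_one(term, 0.2, 0.5, 3.0))
```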

Draper and St. John (1977) suggest a model which includes inverse
terms, $1/x_i$, in addition to terms in the Scheffe polynomials. Such
a term is used to model an extreme change in the response as $x_i$
approaches zero. The experimental region of interest is assumed to
include the region near the zero boundary ($x_i = 0$), but does not
include the boundary itself. One example of this type of model is the
Scheffe linear polynomial model with inverse terms

$$E(Y) = \sum_{i=1}^{q} \beta_i x_i + \sum_{i=1}^{q} \beta_{-i} x_i^{-1} .$$

Another model form that is useful in the study of the response in a
mixture system is the model containing ratios of the component
proportions. A term such as $x_i/x_j$ measures the relationship of
$x_i$ to $x_j$ rather than the percentage of each in the blends. Snee
(1973) points out that the ratio model presents a useful alternative
to the Scheffe and Becker models in that the ratio model describes a
different type of curvature. He notes that the curvilinear terms for
the Scheffe and Becker models, when plotted as a function of $x_i$,
are symmetric functions about $x_i = 1/2$, whereas the ratio term
$x_i/x_j$ is a monotone function when plotted against $x_i$.

The terms in the ratio models may also contain sums of the
components. For example, with $q = 3$ components, we might express
the second order model

$$E(Y) = \beta_0 + \sum_{i=1}^{q-1} \beta_i z_i + \sum\sum_{1 \le i < j \le q-1} \beta_{ij} z_i z_j + \sum_{i=1}^{q-1} \beta_{ii} z_i^2$$

(note the constant term) where $z_1$ and $z_2$ are defined as
$z_1 = x_1/(x_2 + x_3)$ and $z_2 = x_2/x_3$. Some terms will be
undefined if points from the boundary of the experimental simplex are
included in the design; for example, if $x_3 = 0$, then
$z_2 = x_2/x_3$ is not defined. Snee (1973) suggests adding a small
positive quantity, $c$, to each $x_i$ in this case. This, of course,
will not be of concern if the experimental region is entirely inside
of the simplex.

When one or more of the components is inactive, Becker (1978)
suggests that a ratio model that is homogeneous of degree zero in the
remaining components is appropriate. In three components, such a
model is of the form

$$E(Y) = \beta_0 + \beta_1 x_1/(x_1 + x_2) + \beta_2 x_2/(x_2 + x_3) + \beta_3 x_3/(x_1 + x_3) + \sum\sum_{1 \le i < j \le 3} \beta_{ij} h_{ij}(x_i, x_j) + \beta_{123} h_{123}(x_1, x_2, x_3), \tag{1.5}$$

where $h_{ij}$ and $h_{123}$ are specified functions that are
homogeneous of degree zero. The function $h_{123}$ is intended to
represent the joint effect of all three components simultaneously.
If in fitting a model of the form (1.5) we determine the model should
be

$$E(Y) = \beta_0 + \beta_1 x_1/(x_1 + x_2) + \beta_{12} h_{12}(x_1, x_2)$$

then component three is said to be inactive and is removed from
further consideration. The model of equation (1.5) may produce an
extreme value near the vertices of the simplex factor space when
there are no inactive components. In this case it is suggested that a
model of the form (1.5) be used only when the proportions are
restricted so that no two of the $x_i$ are simultaneously very close
to zero. Becker notes that other authors who have suggested ratio
models have also used them primarily over a subregion inside the
simplex factor space. Apparently this is where they are most
appropriate.

1.2.2   Experimental Designs for Mixtures

As in the general response surface problem, one of the major concerns
in exploring a mixture system is that of choosing the experimental
design for collecting observed values of the response that will be
used in fitting the model. Scheffe (1958) proposed the $\{q, m\}$
simplex lattice designs for exploring the entire $q$-component
simplex factor space. In these designs, the proportions used for each
component have the $m + 1$ values spaced equally from zero to one,
$x_i = 0, 1/m, 2/m, \ldots, (m - 1)/m, 1$, and all possible mixtures
with these proportions for each component are used. The number of
design points in the $\{q, m\}$ simplex lattice design is
$\binom{q + m - 1}{m}$. The main appeal of these designs is that they
provide a uniform coverage of the factor space. Another feature,
which Scheffe (1958) demonstrates, is that the parameters of the
canonical polynomial of degree $m$ in $q$ components are expressible
as simple linear combinations of the true response values at the
design points of the $\{q, m\}$ simplex lattice. The $\{3, 2\}$
simplex lattice, which consists of six design points, is represented
on the two dimensional simplex in Figure 1 along with the triangular
coordinates $(x_1, x_2, x_3)$.
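
The construction of the $\{q, m\}$ simplex lattice can be sketched in a few lines. The function name `simplex_lattice` is hypothetical; the sketch enumerates all combinations of the $m + 1$ equally spaced levels and keeps those whose proportions sum to one, using exact fractions to avoid rounding.

```python
from fractions import Fraction
from itertools import product
from math import comb

def simplex_lattice(q, m):
    """Return the design points of the {q, m} simplex lattice."""
    points = []
    # Each component takes a level c/m with c in 0, 1, ..., m.
    for counts in product(range(m + 1), repeat=q):
        if sum(counts) == m:                      # proportions sum to one
            points.append(tuple(Fraction(c, m) for c in counts))
    return points

lattice_32 = simplex_lattice(3, 2)
for point in lattice_32:
    print(tuple(str(x) for x in point))
print(len(lattice_32), comb(3 + 2 - 1, 2))
```

For the $\{3, 2\}$ lattice this lists the three vertices and the three edge midpoints, six points in all, in agreement with the binomial count.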

Scheffe (1963) also developed the simplex centroid designs consisting
of $2^q - 1$ points, where the only mixtures considered are the ones
in which the components present appear in equal proportions. That
is, in a $q$-component simplex centroid design, the design points
correspond to the $q$ permutations of $(1, 0, 0, \ldots, 0)$, the
$\binom{q}{2}$ permutations of $(1/2, 1/2, 0, \ldots, 0)$, the
$\binom{q}{3}$ permutations of $(1/3, 1/3, 1/3, 0, \ldots, 0)$,
$\ldots$, and the point $(1/q, 1/q, \ldots, 1/q)$. This design
alleviates the problem inherent in the $\{q, m\}$ simplex lattice
designs of observing responses at mixtures containing at most $m$
components. To give an example, the $q = 3$ simplex centroid design
is made up of $2^3 - 1 = 7$ design points, and is equivalent to the
$\{3, 2\}$ simplex lattice design augmented by the center point
$(x_1, x_2, x_3) = (1/3, 1/3, 1/3)$. This design is represented in
Figure 2.
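
The simplex centroid design can be generated by looping over every nonempty subset of the $q$ components and assigning equal proportions to the components in the subset. The following is a minimal sketch with a hypothetical function name.

```python
from fractions import Fraction
from itertools import combinations

def simplex_centroid(q):
    """Return the 2**q - 1 design points of the q-component simplex centroid design."""
    points = []
    for size in range(1, q + 1):
        # Each subset of `size` components shares the mixture equally.
        for present in combinations(range(q), size):
            points.append(tuple(
                Fraction(1, size) if i in present else Fraction(0)
                for i in range(q)
            ))
    return points

centroid_3 = simplex_centroid(3)
for point in centroid_3:
    print(tuple(str(x) for x in point))
```

For $q = 3$ the seven points are the three vertices, the three edge midpoints, and the overall centroid.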

Scheffe (1963) mentions that the number of parameters in the
polynomial model of the form

$$E(Y) = \sum_{i=1}^{q} \beta_i x_i + \sum\sum_{1 \le i < j \le q} \beta_{ij} x_i x_j + \sum\sum\sum_{1 \le i < j < k \le q} \beta_{ijk} x_i x_j x_k + \cdots + \beta_{12 \ldots q} x_1 x_2 \cdots x_q \tag{1.6}$$

is $2^q - 1$ and therefore these models have a special relationship
with the simplex centroid design in $q$ components. This relationship
is that the number of terms in the model equals the number of points
in the design and as a result the parameters in model (1.6) are
expressible as simple functions of the responses at the $2^q - 1$
points of the simplex centroid design. Polynomial models of the form
(1.6) therefore are natural models to fit using the simplex centroid
design.

Figure 1. The {3,2} simplex lattice design.

Figure 2. The q = 3 simplex centroid design.
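
For $q = 3$, the "simple functions" relating the parameters of model (1.6) to the responses at the seven simplex centroid points can be written out by evaluating the model at each design point and solving in turn. The sketch below does this and confirms that the resulting model reproduces the responses exactly. The function names and the response values are made up for illustration; the closed forms are the ones obtained from that substitution.

```python
def centroid_coefficients(y):
    """Solve model (1.6) with q = 3 from responses at the simplex centroid points.

    `y` maps a tuple of component labels, such as (1,), (1, 2), or (1, 2, 3),
    to the response at the centroid of those components.
    """
    b = {}
    for i in (1, 2, 3):
        b[(i,)] = y[(i,)]                                   # vertices
    for i, j in ((1, 2), (1, 3), (2, 3)):
        b[(i, j)] = 4 * y[(i, j)] - 2 * (y[(i,)] + y[(j,)])  # edge midpoints
    b[(1, 2, 3)] = (27 * y[(1, 2, 3)]
                    - 12 * (y[(1, 2)] + y[(1, 3)] + y[(2, 3)])
                    + 3 * (y[(1,)] + y[(2,)] + y[(3,)]))     # overall centroid
    return b

def predict(b, x):
    """Evaluate model (1.6) with q = 3 at x = (x1, x2, x3)."""
    x1, x2, x3 = x
    return (b[(1,)] * x1 + b[(2,)] * x2 + b[(3,)] * x3
            + b[(1, 2)] * x1 * x2 + b[(1, 3)] * x1 * x3 + b[(2, 3)] * x2 * x3
            + b[(1, 2, 3)] * x1 * x2 * x3)

# Hypothetical responses at the seven design points (not from the text).
y = {(1,): 10, (2,): 12, (3,): 8,
     (1, 2): 13, (1, 3): 9.5, (2, 3): 11, (1, 2, 3): 12}
b = centroid_coefficients(y)

# The fitted model passes through every design point.
gaps = []
for subset, response in y.items():
    x = tuple(1 / len(subset) if i in subset else 0.0 for i in (1, 2, 3))
    gaps.append(abs(predict(b, x) - response))
print(b, max(gaps))
```

Because the number of parameters equals the number of design points, the fit is exact and leaves no degrees of freedom for estimating error from these seven points alone.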

Ratio models may be desirable when the interest in one or more of the
mixture components is in terms of their relationship to one another,
rather than in terms of their percentages in blends. Kenworthy (1963)
proposed factorial arrangements for ratio variables. An example of
the use of ratios is the following three component system in which
the mixture components are constrained by the upper and lower bounds:

$$0.2 \le x_1 \le 0.4, \qquad 0.2 \le x_2 \le 0.4, \qquad 0.3 \le x_3 \le 0.5. \tag{1.7}$$

The ratio variables of interest are $z_1 = x_2/x_1$ and
$z_2 = x_2/x_3$, and it is desired to fit either a first or a second
order polynomial model in $z_1$ and $z_2$. For such a problem, we can
define a $2^2$ and a $3^2$ factorial design that can be used for
fitting the first and second order polynomial models, respectively,
by taking as design points the intersection of rays passing from two
of the three vertices of the two-dimensional simplex through the
region of interest defined by the constraints (1.7). Kenworthy's
$2^2$ factorial design is shown in Figure 3.
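
A factorial arrangement in the ratio variables can be converted back to mixture proportions by inverting $z_1 = x_2/x_1$ and $z_2 = x_2/x_3$ under the restriction $x_1 + x_2 + x_3 = 1$, which gives $x_2 = 1/(1 + 1/z_1 + 1/z_2)$. The sketch below does this for a $2^2$ arrangement. The ratio levels are hypothetical and are not the levels used by Kenworthy (1963); they were chosen only so that the resulting blends fall inside bounds of 0.2 to 0.4 for the first two components and 0.3 to 0.5 for the third.

```python
from itertools import product

def proportions_from_ratios(z1, z2):
    """Invert z1 = x2/x1 and z2 = x2/x3 subject to x1 + x2 + x3 = 1."""
    x2 = 1.0 / (1.0 + 1.0 / z1 + 1.0 / z2)
    return (x2 / z1, x2, x2 / z2)

def within_bounds(x, lower=(0.2, 0.2, 0.3), upper=(0.4, 0.4, 0.5)):
    """Check the component bounds for one blend."""
    return all(lo <= v <= hi for v, lo, hi in zip(x, lower, upper))

# Hypothetical low and high levels of each ratio (an assumption for illustration).
z1_levels = (0.8, 1.2)
z2_levels = (0.6, 0.9)

design = [proportions_from_ratios(z1, z2)
          for z1, z2 in product(z1_levels, z2_levels)]
for x in design:
    print([round(v, 4) for v in x], within_bounds(x))
```

Each design point lies on the intersection of a ray of constant $z_1$ with a ray of constant $z_2$, which is the geometric construction described above.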

Becker (1978) uses rays extending from one or more vertices of the
simplex factor space to the opposite boundaries in developing "radial
designs." These designs are suggested for detecting the presence of
an inactive component or, in another case, a component which has an
additive effect, when models containing ratio terms that are
homogeneous of degree zero are fitted.

Figure 3. Kenworthy's $2^2$ factorial design.

McLean  and  Anderson  (1966)  suggest  an  algorithm  for 
locating  the  vertices  of  a  restricted  region  of  the  simplex 
factor  space  which  is  defined  by  the  placing  of  upper  and 
lower  bounds  on  the  mixture  component  proportions.   The 
vertices  of  the  factor  space  and  convex  combinations  of  the 
vertices  are  the  candidates  for  design  points  for  fitting  a 
first or second degree polynomial model in the mixture components.
One drawback of the "extreme vertices" design is that the design
points are not uniformly distributed over the factor space, resulting
in an imbalance in the variances of $\hat{Y}(x)$; see Cornell (1973).
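
A vertex search of this kind can be sketched as follows: set all but one component to its lower or upper bound in every possible combination, compute the remaining component from the restriction that the proportions sum to one, and keep the blend if that component falls within its own bounds. This is a minimal sketch in the spirit of the McLean and Anderson procedure rather than a transcription of it; the function name, tolerance, and rounding are assumptions, and the bounds in the example are illustrative values of 0.2 to 0.4 for two components and 0.3 to 0.5 for the third.

```python
from itertools import product

def extreme_vertices(lower, upper, tol=1e-9):
    """Find the vertices of a constrained region of the simplex."""
    q = len(lower)
    found = set()
    for free in range(q):
        others = [i for i in range(q) if i != free]
        # Every combination of lower/upper bounds for the other q - 1 components.
        for choice in product(*[(lower[i], upper[i]) for i in others]):
            remainder = 1.0 - sum(choice)
            if lower[free] - tol <= remainder <= upper[free] + tol:
                point = [0.0] * q
                for i, value in zip(others, choice):
                    point[i] = value
                point[free] = remainder
                # Round so that the same vertex found twice is stored once.
                found.add(tuple(round(v, 6) for v in point))
    return sorted(found)

vertices = extreme_vertices((0.2, 0.2, 0.3), (0.4, 0.4, 0.5))
for v in vertices:
    print(v)
```

For these bounds the search returns six vertices, which together with selected convex combinations would form the candidate list of design points.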

Another method that has been suggested for studying the response over
a sub-region of the simplex mixture space is to transform the $q$
mixture components into $q - 1$ independent variables. Transforming
to an independent variable system was first suggested by Claringbold
(1955) and later proposed by Draper and Lawrence (1965a, 1965b) and
Thompson and Myers (1968). Standard response surface polynomial
models in the transformed variables can be fitted to data values
collected on standard designs and a design criterion such as the
average mean square error of the response can be employed when
distinguishing between designs. Thompson and Myers (1968) suggest the
use of rotatable designs (see also Cornell and Good, 1970).

Designs  other  than  rotatable  designs,  such  as  multiple 
lattices  and  symmetric-simplex  designs,  to  name  a  few,  have 
been  suggested  in  the  literature  for  fitting  models  to  a 
mixture system which may be appropriate depending on particular
experimental situations. However, as the intent
here  is  not  to  give  an  exhaustive  list  but  only  a  sampling 
of  available  designs,  we  shall  not  discuss  designs  further 
but  instead  state  the  purpose  of  this  work. 

1.3   The Purpose of this Research:
Investigation of Procedures for Testing a Model
Fitted in a Mixture System for Lack of Fit

A common problem in modeling the response in a mixture system is that
of detecting lack of fit, or inadequacy, of a fitted model of the
form $E(Y) = X\beta_1$ when the true model is of the form
$E(Y) = X\beta_1 + X_2\beta_2$. The statistical literature suggests
several procedures for testing lack of fit, which will be described
in Chapter Two. In general, the authors of these procedures are not
specific in stating hypotheses to be tested and do not adequately
discuss the power of their procedures.

The major purpose of this research is to investigate the power of two
of the testing procedures appearing in the literature in detecting
the inadequacy of a fitted model when the general form of the true
model is specified. Our findings for a "check points" lack of fit
testing procedure are presented in Chapter Three while Chapter Four
contains findings for a "near neighbor" lack of fit testing
procedure. For both procedures, explicit formulas for the power of
the test are given in terms of cumulative probabilities of either the
noncentral F or doubly noncentral F distribution, which are derived
by assuming that the response observations are independent and
normally distributed. Additionally, we propose methods for maximizing
the power of the testing procedures. In the final chapter, we make
some concluding comments concerning both of these procedures.


CHAPTER TWO
LITERATURE REVIEW: TESTING FOR LACK OF FIT

2.1   Introduction 


Let us return to the general response surface problem and assume the
true response is to be approximated by fitting a model of the form

$$E(Y) = X\beta_1 \tag{2.1}$$

where $Y$ is an $N \times 1$ vector of observable response values,
$X$ is an $N \times p$ matrix of known constants, and $\beta_1$ is a
$p \times 1$ vector of unknown regression coefficients. We wish to
consider the situation in which the true model contains terms in
addition to those in the fitted model. Then the true model has the
form

$$E(Y) = X\beta_1 + X_2\beta_2 \tag{2.2}$$

where $X_2$ is an $N \times p_2$ matrix of known constants, and
$\beta_2$ is a $p_2 \times 1$ vector of unknown regression
coefficients. We assume that the vector $Y$ has the normal
distribution with $\mathrm{var}(Y) = \sigma^2 I_N$.

It is desirable to determine the suitability of the fitted model
given by Eq. (2.1) when in reality the true model is of the form
given by Eq. (2.2). The process of making this determination is
referred to as testing for lack of fit of the fitted model.

There  are  three  general  approaches  to  testing  for  lack 
of  fit.   The  first  approach  requires  that  there  be  replicate 
observations  of  the  response  at  one  or  more  design  points, 
and  involves  partitioning  the  residual  sum  of  squares  from 
the  fitted  model  into  a  sum  of  squares  due  to  lack  of  fit 
and  a  sum  of  squares  due  to  pure  error.   A  large  value  for 
the  ratio  of  the  mean  square  due  to  lack  of  fit  to  the  mean 
square  due  to  pure  error  provides  evidence  for  lack  of  fit. 

If  replicate  observations  are  not  available  then  the 
above  approach  to  testing  for  lack  of  fit  cannot  be  used. 
Green  (1971),  Daniel  and  Wood  (1971),  and  Shillington  (1979) 
have  proposed  alternative  methods  that  are  applicable  in 
this  case.   Their  approach  is  to  group  values  of  the 
response  which  are  observed  at  similar  settings  of  the 
independent  variables  and  to  call  these  grouped  values 
"pseudoreplicates"  or  "near  neighbor  observations."   They 
then  treat  these  pseudoreplicates  as  they  would  treat  true 
replicates  to  form  statistics  for  lack  of  fit  testing, 
although  arriving  at  their  respective  statistics  through 
different  approaches. 

A  third  approach  to  testing  for  lack  of  fit  involves 
the  use  of  "check  points."   In  this  method  a  model  of  the 
form  (2.1)  is  fitted  to  data  at  the  design  points  and 
additional  observations  are  collected  at  other  points  in  the 
experimental region. The additional points other than the
design points are called check points, and the data at these
check  points  are  not  used  in  fitting  the  model.   Lack  of  fit 
is  tested  by  comparing  the  values  of  the  response  observed 
at  the  check  points  to  the  values  of  the  response  which  the 
fitted  model  predicts  at  these  same  check  points. 

We  now  discuss  the  first  method  mentioned  above  of 
testing  for  lack  of  fit  which  involves  partitioning  the 
residual  sum  of  squares. 

2.2   Partitioning  the  Residual  Sum  of  Squares 

The  method  for  testing  lack  of  fit  which  makes  use  of  a 
partitioning  of  the  residual  sum  of  squares  from  the  fitted 
model  requires  there  be  replicate  observations  of  the 
response  at  some  of  the  design  points  (Draper  and  Smith, 
1981,  p.  120).   When  a  model  of  the  form  (2.1)  is  fitted, 
the  residual  sum  of  squares  is  defined  as 

\[
SSE = \sum_{i=1}^{n} \sum_{j=1}^{n_i} (Y_{ij} - \hat{Y}_i)^2
    = Y'(I_N - X(X'X)^{-1}X')Y ,
\]

where $n$ is the number of distinct design points, $n_i \ge 1$ is
the number of replicate observations at the ith design
point, $Y_{ij}$ is the jth observed value of the response at the
ith design point, $\hat{Y}_i$ is the value which the model of the
form in Eq. (2.1), fitted by ordinary least squares
techniques, predicts for the response at the ith design
point, and $N = \sum_{i=1}^{n} n_i$. Using the replicated observations
only, a pure error sum of squares can be calculated as

\[
SSE_{pure} = \sum_{i=1}^{n} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i.})^2 ,
\]

where $\bar{Y}_{i.}$ is the average of the values of the response
observed at the ith design point. The sum of squares due to
lack of fit can be obtained by taking the difference

\[
SS_{LOF} = SSE - SSE_{pure} .
\]


This  partitioning  of  the  residual  sum  of  squares  is 
displayed in the analysis of variance table in Table 1.


Table 1. Analysis of Variance:
Partitioning the Residual Sum of Squares.

Source of Variation   Sum of Squares          Degrees of Freedom   Mean Square
-------------------   ---------------------   ------------------   -----------
Regression            b_1'X'Y - (1'Y)^2/N     p - 1
Residual              SSE                     N - p                MSE
  Pure Error          SSE_pure                N - n                MSE_pure
  Lack of Fit         SS_LOF                  n - p                MS_LOF
Total (corrected)     Y'Y - (1'Y)^2/N         N - 1

$b_1$ represents the ordinary least squares estimator of $\beta_1$ in
model (2.1), $b_1 = (X'X)^{-1}X'Y$, and $1$ is an N×1 vector of
ones.

To test the hypothesis of zero lack of fit, that is
$H_0$: lack of fit = 0 or $E(Y) = X\beta_1$, an F statistic is formed

\[
F = \frac{MS_{LOF}}{MSE_{pure}} \qquad (2.3)
\]

which possesses a central F distribution if the true model
is of the form (2.1), but has a noncentral F distribution if
the true model is of the form (2.2). In other words

\[
F \sim F_{n-p,\,N-n}
\]

under $H_0$: $E(Y) = X\beta_1$, and

\[
F \sim F'_{n-p,\,N-n;\,\lambda_2}
\]

under $H_a$: $E(Y) = X\beta_1 + X_2\beta_2$, where $\lambda_2$ is the noncentrality
parameter $\lambda_2 = \beta_2'(X_2 - XA)'(X_2 - XA)\beta_2/2\sigma^2$, and $A = (X'X)^{-1}X'X_2$.
Under $H_a$, $E(MS_{LOF}) = \sigma^2 + \beta_2'(X_2 - XA)'(X_2 - XA)\beta_2/(n - p)$ and
$E(MSE_{pure}) = \sigma^2$ (Draper and Smith, 1981, p. 120), hence $H_0$
is rejected in favor of $H_a$ if the value of F in (2.3)
exceeds the upper 100α percentage point of the central F
distribution, $F_{\alpha;\,n-p,\,N-n}$. When $H_0$ is rejected, we conclude
that a significant lack of fit is present.
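
The partitioning above can be illustrated with a small numerical sketch. The data, the straight-line fitted model (p = 2), and the function name `lack_of_fit` are illustrative assumptions and are not taken from the sources cited here; the code only shows how SSE, the pure error sum of squares, the lack of fit sum of squares, and the ratio in (2.3) relate to one another.

```python
def lack_of_fit(x, y, p=2):
    """Partition the residual SS of a straight-line fit (p = 2 parameters)
    into pure error and lack of fit, and form the F ratio of Eq. (2.3)."""
    N = len(x)
    xbar = sum(x) / N
    ybar = sum(y) / N
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # least squares slope
    b0 = ybar - b1 * xbar               # least squares intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

    # Group the responses by distinct design point.
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(xi, []).append(yi)
    n = len(groups)                     # number of distinct design points

    # Pure error: variation of replicates about their own cell mean.
    sse_pure = 0.0
    for ys in groups.values():
        mean = sum(ys) / len(ys)
        sse_pure += sum((v - mean) ** 2 for v in ys)

    ss_lof = sse - sse_pure
    f_ratio = (ss_lof / (n - p)) / (sse_pure / (N - n))
    return {"SSE": sse, "SSE_pure": sse_pure, "SS_LOF": ss_lof,
            "df": (n - p, N - n), "F": f_ratio}


# Hypothetical data: a curved response observed twice at each of four points.
x = [1, 1, 2, 2, 3, 3, 4, 4]
y = [1.1, 0.9, 4.2, 3.8, 8.9, 9.1, 16.2, 15.8]
result = lack_of_fit(x, y)
```

The returned F value would be compared with the upper 100α percentage point of the central F distribution with the degrees of freedom given in `result["df"]`.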

Draper and Herzberg (1971) demonstrated that the lack
of fit sum of squares can be partitioned into two
statistically independent sums of squares, $SS_{L1}$ and $SS_{L2}$,
when there are replicate observations at the center of the
response surface design and when the basic design without
center points is nonsingular. If the true model and the
fitted model are of the same form as in equation (2.1) then
the two F ratios $F_{L1} = [SS_{L1}/(n - p - 1)]/MSE_{pure}$ and
$F_{L2} = SS_{L2}/MSE_{pure}$ are both distributed as central F random
variates, with respective numerator and denominator degrees
of freedom (n - p - 1), (N - n) for $F_{L1}$ and 1, (N - n) for
$F_{L2}$. If the true model is of the form shown in equation
(2.2), then $F_{L1}$ and $F_{L2}$ are both distributed as noncentral F
random variates. The expected values of $SS_{L1}$ and $SS_{L2}$ are
used to show what functions of $\beta_2$ are testable with $F_{L1}$ and
$F_{L2}$.

Two examples are presented by Draper and Herzberg to
illustrate this testing for lack of fit. The first example
makes use of a first order orthogonal design in k factors
augmented with center point replicates for fitting a first
order polynomial model. If the true model is of the second
order, then $F_{L2}$ can be used to test a hypothesis concerning
the parameters associated with the pure quadratic terms in
the model. If all such parameters are zero, then $F_{L1}$
provides a check on interaction terms. The second example
illustrates the fitting of a second order polynomial model
to a second order design with all odd design moments of
order six or less zero. If the true model is third degree,
then $F_{L1}$ can be used to test the significance of the third
order terms, while $F_{L2}$ tests terms of order greater than
three. The partitioning of $SS_{LOF}$ into $SS_{L1}$ and $SS_{L2}$ and the
corresponding tests of hypotheses are also given in Myers
(1971, p. 114-119), for the special case of fitting a first
order polynomial model to a $2^k$ factorial or a fraction of a
$2^k$ factorial design augmented with center point replicates
when the true model is of the second degree.

A  more  complete  partitioning  of  the  lack  of  fit  sum  of 
squares  in  an  attempt  to  obtain  a  more  detailed  diagnosis  of 
the  lack  of  fit  of  the  fitted  model  is  given  in  a  technical 
report  written  by  Khuri  and  Cornell  (1981).   The  lack  of  fit 
sum  of  squares,  which  has  n  -  p  degrees  of  freedom,  is 
partitioned  into  n  -  p  independent  sums  of  squares,  each 
having  one  degree  of  freedom.   The  expected  values  of  these 
single  degree-of-freedom  sums  of  squares  are  used  to 
identify  at  most  n  -  p  linearly  independent  causes  for  the 
lack  of  fit  variation.   Tests  of  significance  are  performed 
on  the  assumed  contributing  causes.   This  method  enables  the 
screening of all subsets of $\beta_2$ in order to identify those
subsets  which  are  most  responsible  for  lack  of  fit  of  the 
fitted  model. 

We  shall  now  discuss  the  second  general  approach  used 
in  lack  of  fit  testing,  which  is  to  test  for  lack  of  fit  by 
making  use  of  response  values  observed  at  points  which  are 
near  neighbors  in  the  factor  space  when  true  replicate 
observations  are  not  available. 

2.3   Testing for Lack of Fit Without
Replicated  Observations — Near  Neighbor  Procedures 

Green  (1971)  suggests  the  following  approach  when 
testing  for  lack  of  fit  if  there  are  no  design  points  at 
which  replicate  observations  of  the  response  are 
available.   The  N  observed  values  of  a  response,  Y, 
considered  a  function  of  only  one  variable,  x,  are  divided 
into  g  groups,  by  grouping  observations  which  have  similar 
values of x. Green hypothesizes a model of the form
$Y = H\alpha + \varepsilon$, where $Y$ is an N×1 vector of observable responses,
$H$ is an N×m matrix whose columns correspond to known functions of
the variable, x, $\alpha$ is an m×1 vector of unknown regression
coefficients, and $\varepsilon$ is the N×1 vector of random errors,
$\varepsilon \sim N_N(0, \sigma^2 I_N)$.

Green's method assumes that the vector of differences
$(E(Y) - H\alpha)$ can be well approximated by a dth order polynomial
in x within each of the g groups, $d \ge 1$. An alternative
model of the form

\[
Y = H_1\nu + \eta + \varepsilon
\]

is given, where $\varepsilon$ is distributed as $N_N(0, \sigma^2 I_N)$, $H_1$ is an
N×[g(d + 1) + $m_1$] matrix of known constants, $\nu$ is a
[g(d + 1) + $m_1$]×1 vector of regression coefficients, and $\eta$,
as Green states, is "a small vector." The first g(d + 1)
columns of $H_1$ correspond to the polynomial terms for the g
groups (with (d + 1) terms for each group), and the rightmost
$m_1 \le m$ columns in $H_1$ correspond to terms that are in the
fitted model, but are not represented among the g(d + 1)
polynomial terms in the alternative model.

Under the assumption that $\eta = 0$, the presence of lack
of fit is tested by using the test statistic:

\[
F = \frac{Y'[H_1(H_1'H_1)^{-1}H_1' - H(H'H)^{-1}H']Y / [g(d + 1) + m_1 - m]}
         {Y'[I - H_1(H_1'H_1)^{-1}H_1']Y / [N - g(d + 1) - m_1]} .
\qquad (2.4)
\]

This statistic is of the same form as the F statistic used
in the standard multiple regression test of a postulated
model against a more general one which includes the
postulated model as a special case. Lack of fit is
suspected if the calculated F ratio in (2.4) is greater
than $F_{\alpha;\,g(d+1)+m_1-m,\,N-g(d+1)-m_1}$, where this latter quantity
is the upper 100α percentage point of the central F
distribution.

Green notes that when there is no lack of fit, the
quadratic forms $Y'[H_1(H_1'H_1)^{-1}H_1' - H(H'H)^{-1}H']Y$ and
$Y'[I - H_1(H_1'H_1)^{-1}H_1']Y$ are distributed independently as
$\sigma^2\chi^2$ with g(d + 1) + $m_1$ - m and N - g(d + 1) - $m_1$ degrees of
freedom, respectively. In this case the F ratio in (2.4)
possesses a central F distribution. If there is lack of fit
on the other hand, then these two quadratic forms are
distributed as noncentral chi-squares, multiplied by $\sigma^2$,
with respective noncentrality parameters
$\zeta_1 = [H_1\nu + \eta]'[H_1(H_1'H_1)^{-1}H_1' - H(H'H)^{-1}H'][H_1\nu + \eta]$
and $\zeta_2 = \eta'[I - H_1(H_1'H_1)^{-1}H_1']\eta$. Thus the assumption that
$\eta = 0$ can affect the power of the test, since if $\eta \ne 0$, the
expected value of MSE is greater than $\sigma^2$, where MSE is the
quadratic form in the denominator of the F ratio. Hence if
$\eta \ne 0$, the probability of calculating a large F value is
reduced, and we are less likely to detect lack of fit using
an upper tailed rejection region.

Daniel and Wood (1971) suggest another method for lack
of fit testing when replicated observations of the response
are not available. They make use of "near replicates" to
obtain an estimate of $\sigma$, which is the standard deviation of
the observable responses in the true model. The value of
the estimate $\hat{\sigma}$ is compared to the square root of the
residual mean square from the analysis of the fitted
model. Lack of fit is indicated if the square root of the
residual mean square is large compared to the estimate $\hat{\sigma}$.
To determine when observations are near replicates so that
an estimate of $\sigma$ can be found, they define the squared
distance between any two data points, j and j', to be
measured by

\[
D_{jj'}^2 = \sum_{i=1}^{K} \left[ \frac{b_i (x_{ij} - x_{ij'})}{s} \right]^2 ,
\]

where $x_{ij}$ and $x_{ij'}$ are the values of the ith independent
variable corresponding to the observations $y_j$ and $y_{j'}$,
respectively, i = 1, 2, ..., K, and $b_i$ is the ordinary least
squares estimate of the ith regression coefficient. In the
denominator, s is the square root of the residual mean
square for the fitted model.

To obtain an estimate of $\sigma$ from near replicates, let
$\Delta_n d = |d_j - d_{j'}|$, n = 1, 2, ..., $\binom{N}{2}$, where $d_j$ and $d_{j'}$ are
the residuals at points j and j', respectively, and where
there are N data observations in the experiment. Since the
expected value of the range for pairs of independent
observations from a normal distribution is 1.128$\sigma$, a running
average of the $\Delta_n d$'s is calculated and multiplied by
.886 = (1/1.128) to get a running estimate, $s_n$, of $\sigma$.
That is, $s_n = .886 \sum_{m=1}^{n} \Delta_m d / n$. The closest pair
of observations as judged by $D_{jj'}^2$ is used to begin the
running estimate, the next closest pair (next "nearest
neighbors") is used for $\Delta_2 d$, and the procedure continues
until $s_n$ "stabilizes." The stabilized value of $s_n$ is used
to estimate $\sigma$.
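
The running estimate can be sketched in a few lines of code. This is only an illustration under stated assumptions: the squared distance between points j and j' is taken to be the sum over variables of [b_i(x_ij - x_ij')/s]^2, pairs are processed from closest to farthest, and the function name `running_sigma_estimates` and the small data set are hypothetical. The decision of when the sequence has "stabilized" is left to the analyst, as in the description above.

```python
from itertools import combinations


def running_sigma_estimates(X, residuals, b, s):
    """Return the running estimates s_1, s_2, ... of sigma.

    X         : list of rows, one row of K predictor values per observation
    residuals : residuals d_j from the fitted model
    b         : least squares coefficient estimates b_1, ..., b_K
    s         : square root of the residual mean square
    """
    pairs = []
    for j, jp in combinations(range(len(X)), 2):
        # Assumed weighted squared distance between observations j and j'.
        dist2 = sum((b[i] * (X[j][i] - X[jp][i]) / s) ** 2
                    for i in range(len(b)))
        pairs.append((dist2, abs(residuals[j] - residuals[jp])))

    pairs.sort(key=lambda pair: pair[0])   # nearest neighbors first

    estimates = []
    total = 0.0
    for n, (_, delta) in enumerate(pairs, start=1):
        total += delta
        estimates.append(0.886 * total / n)   # 0.886 = 1/1.128
    return estimates


# Hypothetical one-variable example with four observations.
X = [[0.0], [0.1], [1.0], [1.3]]
residuals = [0.5, -0.5, 0.2, 0.4]
estimates = running_sigma_estimates(X, residuals, b=[2.0], s=1.0)
```

Only the first few entries of `estimates`, which come from the nearest pairs, would normally be inspected; later entries mix in pairs that are far apart and are no longer near replicates.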

A  third  method  for  testing  for  lack  of  fit  without 
replication is given by Shillington (1979). The fitted
model is of the form

\[
Y = X\beta + \varepsilon \qquad (2.5)
\]

where $Y$ (N×1), $X$ (N×p), and $\beta$ (p×1) are defined as in (1.2)
and $\varepsilon \sim N_N(0, \sigma^2 I_N)$. The test for lack of fit of the
fitted model is a test for whether the true model has the
form

\[
Y = X\beta + \delta + \varepsilon ,
\]

where $\delta$ (N×1) is a fixed effect quantifying the departure of
(2.5) from the true model.

Shillington assumes that the data can be grouped into g
cells, with $n_j$ observations in the jth cell, determined in
advance. Letting $C_j$ refer to the jth cell, j = 1, 2, ...,
g, a vector of cell averages is written $\bar{Y}_B$ (g×1), where the
jth element of $\bar{Y}_B$ is the average of the observed responses
in $C_j$. The matrix $X_B$ of independent variables associated
with $\bar{Y}_B$ is the g×p matrix where the elements in the jth row
are

\[
\bar{x}'_{.j} = \sum_{i=1}^{n_j} x'_{ij}/n_j ,
\]

that is, row j of $X_B$ is the row vector $\bar{x}'_{.j}$. The matrix
$X_B$ is assumed to be of full rank p < g. Also within each
cell are defined the differences
$W_{ij} = Y_{ij} - \bar{Y}_{.j}$, $i \in C_j$, j = 1, 2, ..., g, where $\bar{Y}_{.j}$ is
the jth element of $\bar{Y}_B$.

The two independent data sets, $\bar{Y}_B$ and $\{W_{ij}\}$ with g and
N - g degrees of freedom, respectively, are used to find two
independent estimates of $\sigma^2$. The first estimate is written
as

\[
MSE_B = \sum_{j=1}^{g} n_j (\bar{Y}_{.j} - \bar{x}'_{.j}\hat{\beta}_B)^2 / (g - p) ,
\]

where $\hat{\beta}_B$ is the weighted least squares estimate of $\beta$ using
the regression of cell means, $\bar{Y}_B$, on $X_B$. The second
estimate of $\sigma^2$ uses the within cell deviations on cell
means, $\{W_{ij}\}$, and is

\[
MSE_W = \sum_{j=1}^{g} \sum_{i=1}^{n_j} (W_{ij} - \hat{W}_{ij})^2 / (N - g - r) ,
\]

where r is the rank of an N×p matrix with rows equal to
$x'_{ij} - \bar{x}'_{.j}$, $i \in C_j$, j = 1, 2, ..., g. If the matrix of
independent variables, corrected for cell means, is of full
rank, then r = p. Here $\hat{W}_{ij}$ is the estimate of $W_{ij}$ from the
regression of cell residuals $\{W_{ij}\}$ on the associated vectors
of independent variates, $x'_{ij} - \bar{x}'_{.j}$.

If the fitted model is the correct model, then $MSE_B$ and
$MSE_W$ are independent estimates of $\sigma^2$ and the ratio $MSE_B/MSE_W$
is an F statistic with g - p and N - g - r degrees of
freedom.   When  all  observations  in  a  cell  have  the  same 
settings  of  the  independent  variables,  that  is,  the 
observations  are  truly  replicates  for  all  cells,  then  this  F 
statistic  is  identical  to  the  F  statistic  in  the  usual  lack 
of  fit  test  in  which  the  residual  sum  of  squares  is 
partitioned  into  lack  of  fit  and  pure  error  sums  of  squares, 
as  given  in  Draper  and  Smith  (1981,  p.  120). 

If the true model is $Y = X\beta + \delta + \varepsilon$, however, and if
we let $X'\delta = 0$ and $\sigma_\delta^2 = \delta'\delta/N$, then

\[
E(MSE_B) = \sigma^2 + \delta_B'[I - X_B(X_B'X_B)^{-1}X_B']\delta_B/(g - p) ,
\]

where $\delta_B$ (g×1) has jth component equal to $\sum_{i=1}^{n_j} \delta_{ij}/n_j$.
Furthermore, with this latter true model form

\[
E(MSE_W) = \sigma^2 + \delta_W'(I - X_W(X_W'X_W)^{-1}X_W')\delta_W/(N - g - r) ,
\]

where $\delta_W$ has the components $\delta_{ij} - \bar{\delta}_{.j}$, $i \in C_j$, j = 1, 2,
..., g. The matrix $X_W$ (N×p) has the rows $x'_{ij} - \bar{x}'_{.j}$,
$i \in C_j$, j = 1, 2, ..., g. The power of the F test,
$F = MSE_B/MSE_W$, depends on the relative bias of the estimates
of $\sigma^2$, that is, the biases in $MSE_B$ and $MSE_W$.

Shillington states that the power of the F test which
makes use of $F = MSE_B/MSE_W$ is maximized by forming cells so
that the bias of $MSE_W$ is minimized. This is the same as
forming cells so that the within cell variation in $\delta$ is
minimized. Shillington (1979, p. 141) also states,
"Observations with near covariate (independent variable)
values might be expected to have similar $\delta$ values, since we
assume that $\delta$ varies in some continuous but unknown fashion
with x. This justifies the usual procedure of forming
groups by collapsing observations with adjacent covariate
values. Indeed, if covariates do not vary within cells we
have the usual lack of fit test and maximum power."

By imposing a further structure on the form of $\delta$, it is
shown that if the F test has an upper tailed rejection
region, the power is maximized by selecting the group sizes
as $n_j = 2$, j = 1, 2, ..., g. Finally, Shillington suggests
that in the presence of more than one independent variable
problems in grouping may arise, and in this case it may be
wise to perform a different lack of fit test for each
parameter. Following this approach, an example is given
which suggests testing lack of fit for each of the p
independent variables separately may be more powerful than
trying to form groups based on all independent variables at
once.

In  summary,  all  the  approaches  we  have  discussed  for 
testing  for  lack  of  fit  when  replicate  observations  of  the 
response  are  not  available  at  any  of  the  settings  of  the 
independent  variables  make  use  of  grouping  the  observed 
response  values  according  to  similar  values  of  the 
independent  variables.   The  observations  falling  in  such 
groups  are  referred  to  as  "pseudoreplicates"  or  "near 
neighbor  observations."   These  pseudoreplicates  are  used  to 
estimate the true variance of the observations, $\sigma^2$, but a
completely unbiased estimate of $\sigma^2$ cannot be attained unless
true replicate observations are available. In each case,
the power of the lack of fit testing procedure is reduced
because an unbiased estimate of $\sigma^2$ is not attainable. We
now turn to the use of check points for lack of fit testing.

2.4   Testing for Lack of Fit with Check Points

An  alternative  to  the  two  approaches  to  lack  of  fit 
testing  already  discussed  is  the  method  which  makes  use  of 
check points. We assume a model of the form $E(Y) = X\beta_1$, as
given in (2.1), is fitted in a response surface system, but
that the true model is of the form $E(Y) = X\beta_1 + X_2\beta_2$ as
given in (2.2). The parameters, $\beta_1$, in the fitted model are
estimated by ordinary least squares techniques, making use
of the values of the response observed at the design
points. After the model is fitted, values of the response
are observed at additional points in the experimental region
called "check points." The observed response values at the
check  points  are  compared  to  the  values  which  the  fitted 
model  predicts  at  these  same  check  points.   It  is  important 
to  note  that  the  observed  values  of  the  response  at  the 
check  points  are  not  used  in  fitting  the  model  initially. 

Snee  (1977)  gives  four  methods  of  validating  regression 
models,  one  of  which  is  the  collection  of  new  data  to  check 
predictions  from  a  previously  fitted  model.   In  a  designed 
experiment  these  new  data  take  the  form  of  check  points. 
Snee  suggests  that  the  inclusion  of  a  small  number  of  check 
points  in  any  designed  experiment  is  a  "worthwhile" 
procedure. 

Scheffe  (1958)  proposed  a  test  for  lack  of  fit  when  the 
{3,2}  simplex  lattice  design  is  used  for  fitting  a  second 
order  canonical  polynomial  model  in  three  mixture 
components.   It  is  desired  to  use  the  observed  value  of  the 
response  at  (1/3,  1/3,  1/3)  as  a  check  point  blend.   The 
test statistic proposed is the t statistic of the form

\[
t = \frac{Y - \hat{Y}}{[\widehat{\mathrm{var}}(Y - \hat{Y})]^{1/2}} \qquad (2.6)
\]

where $Y$ is the observed value of the response at the check
point, and $\hat{Y}$ is the value of the response predicted at the
same point by the second order model which is fitted by
ordinary least squares techniques to the observed response
values at the six design points of the {3,2} simplex
lattice. The response value observed at the point
(1/3, 1/3, 1/3) is not used in fitting the model. Lack of
fit is inferred if the absolute value of the calculated t
value in equation (2.6) is larger than the corresponding
tabled t value.

In the denominator of the t test of equation (2.6), the
variance of the difference $Y - \hat{Y}$ is shown to be

\[
\mathrm{var}(Y - \hat{Y}) = \mathrm{var}(Y) + \mathrm{var}(\hat{Y}) = (44/27r)\sigma^2 ,
\]

when r replicates are taken at each design point. The
estimate of the variance of $Y - \hat{Y}$ is $(44/27r)\hat{\sigma}^2$, where $\hat{\sigma}^2$ is
calculated from the replicated response values at the design
points.
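
The factor 44/27 can be checked with exact arithmetic. The sketch below rests on two assumptions that are not spelled out above: the check point response is taken to be the average of r observations at (1/3, 1/3, 1/3), so that its variance is σ²/r, and the prediction from the second order model fitted to the {3,2} simplex lattice is written as a weighted sum of the six design point averages, with weight x_i(2x_i - 1) on vertex i and 4x_i x_j on the midpoint of the edge joining vertices i and j. The function name `prediction_weights` is illustrative.

```python
from fractions import Fraction
from itertools import combinations


def prediction_weights(x):
    """Weights on the design point averages of the {q,2} simplex lattice
    that give the second order model's prediction at the blend x."""
    q = len(x)
    vertex = [xi * (2 * xi - 1) for xi in x]
    midpoint = [4 * x[i] * x[j] for i, j in combinations(range(q), 2)]
    return vertex + midpoint


centroid = [Fraction(1, 3)] * 3
weights = prediction_weights(centroid)

# Variances below are expressed in units of sigma^2 / r.
var_pred = sum(w * w for w in weights)   # var(Y_hat) at the centroid
var_obs = Fraction(1)                    # var(Y), assuming a mean of r observations
var_diff = var_obs + var_pred            # var(Y - Y_hat)
```

Under these assumptions the prediction variance is (17/27)σ²/r, and adding the variance σ²/r of the check point average gives the stated (44/27r)σ².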

Scheffe (1958) also alludes to a test for lack of fit
when several check points are used simultaneously. When
there are k check points, the test for lack of fit is an F
statistic of the form

\[
F = \frac{d'V_0^{-1}d}{k\hat{\sigma}^2} \qquad (2.7)
\]

where $d' = (Y_1 - \hat{Y}_1, Y_2 - \hat{Y}_2, ..., Y_k - \hat{Y}_k)$, and
$V = \sigma^2 V_0 = \mathrm{var}(d)$. Formulas are given for the elements of $V_0$ in the
special case when the check points are the design points of
the {3,2} simplex lattice. Lack of fit is suspected if the
calculated value of the F statistic given in (2.7) is larger
than the corresponding tabled F value.


Gorman  and  Hinman  (1962)  suggest  the  same  t  test  in 
equation  (2.6)  that  Scheffe  (1958)  suggested  for  a  check 
point  taken  at  (1/3,  1/3,  1/3)  to  test  for  lack  of  fit  in  a 
second  order  polynomial  model  fitted  from  a  {3,2}  simplex 
lattice  design.   They  suggest  using  (1/3,  1/3,  1/3)  as  the 
location  of  the  check  point  because  the  observation  at  this 
point  may  later  be  used  to  fit  the  next  more  complex  model, 
the  special  cubic,  if  the  second  order  model  is  found  to  be 
inadequate.   They  state  that  in  general  for  the  second  order 
polynomial  model  as  well  as  higher  order  models,  check 
points  should  be  taken  in  regions  of  particular  interest,  of 
which  there  are  usually  many  in  any  blending  study. 
Further,  they  suggest  that  the  number  of  check  points 
depends  on  individual  experimental  situations — technical 
background,  precision  required,  cost  of  materials  and 
analyses,  and  probability  of  requiring  a  more  complex 
model.   However,  no  specific  criterion  is  given  by  Gorman 
and  Hinman  for  selecting  the  location  of  the  check  points. 

Gorman and Hinman (1962) indicate that a t test at a
check point other than at (1/3, 1/3, 1/3) takes the same
form as the statistic of equation (2.6),

\[
t = \frac{Y - \hat{Y}}{[\mathrm{var}(Y) + \mathrm{var}(\hat{Y})]^{1/2}} ,
\]

with the additional condition that if several check points
are taken, say for example k points, the method of checking
the fit is to compute the t value at each location and refer
these calculated t values to the 100(α/2k) percentage point
of the central t distribution rather than the 100(α/2)
percentage point.

Kurotori  (1966)  gives  an  example  of  a  mixture 
experiment  where  the  response  is  the  modulus  of  elasticity 
of  a  rocket  fuel,  which  is  a  mixture  of  three  components, 
binder ($x_1$), oxidizer ($x_2$), and fuel ($x_3$). The factor space
of  feasible  mixtures  is  a  subspace  inside  the  two- 
dimensional  simplex  or  triangle  where  all  three  components 
are  present  simultaneously.   "Pseudocomponents"  are  defined 
and  in  the  pseudocomponent  system  a  special  cubic  model  is 
fitted  to  data  collected  at  the  points  of  the  q  =  3  simplex 
centroid  design  (Figure  4).   A  check  for  adequacy  of  fit  is 
made  by  using  three  check  points  and  the  response  values  at 
the  check  points  are  used  only  for  testing  the  fit  of  the 
model  and  not  for  fitting  the  model  initially. 

The  reason  for  the  choice  of  the  particular  check  point 
locations  by  Kurotori  is  that,  as  he  states,  "They  are  the 
most  remote  mixtures  from  the  seven  design  points."   The 
lack of fit test is an F statistic of the form

\[
F = \frac{s^2}{\hat{\sigma}^2} \qquad (2.8)
\]

where $s^2 = \sum_{i=1}^{3} (Y_i - \hat{Y}_i)^2$, for the i = 1, 2, 3 check points
and $\hat{\sigma}^2$ is an estimate of measurement error from a previous
analysis. Kurotori admits that the use of the F statistic
in Eq. (2.8) for lack of fit testing may be risky because
the predicted values at the check points are correlated
(correlation of .5), although the observed values are not
correlated. Kurotori suggests individual t tests as
proposed by Scheffe (1958) might be the preferred procedure.

[Figure 4. Kurotori's rocket fuel example. $x_1'$, $x_2'$, and $x_3'$
represent pseudocomponents. The plot marks the design points
and the check points on the simplex with vertices (1,0,0),
(0,1,0), and (0,0,1).]
Snee  (1971)  repeats  Kurotori's  rocket  fuel  example 
using  the  same  F  test  for  lack  of  fit  as  Kurotori  and  makes 
the comment that the $\hat{Y}_i$'s at the check points are
correlated.   In  stating  that  the  F  test  is  not  an  exact 
test,  he  nevertheless  offers  no  solution  in  the  form  of  an 
exact  test. 

In summary, only Scheffe refers to an exact F test when
several  check  points  are  considered  simultaneously  for 
testing  for  possible  lack  of  fit  of  a  model  fitted  in  a 
mixture  space,  and  his  development  is  limited  to  the  special 
case  where  the  check  points  are  the  design  points  used  to 
fit  the  model  initially.   No  criterion  is  proposed  by 
Scheffe  for  selecting  other  locations  for  the  check  points. 


CHAPTER  THREE 
AN  OPTIMAL  CHECK  POINT  METHOD  FOR  TESTING 
LACK  OF  FIT  IN  A  MIXTURE  MODEL 

3.1   Introduction 


In  Chapter  Three  we  investigate  the  problem  of  testing 
for  lack  of  fit  of  a  linear  model  fitted  in  a  mixture 
space.   The  testing  is  to  be  accomplished  with  the  use  of 
check  points.   We  assume  that  an  experimental  design  is 
specified, and that the fitted model is of the form

\[
E(Y) = X\beta_1 \qquad (3.1)
\]

where $Y$ is an N×1 vector of observable response values, $X$ is
an N×p matrix of known constants and rank p, and $\beta_1$ is a
vector of p unknown regression coefficients. The true model
is assumed to be of the form

\[
E(Y) = X\beta_1 + X_2\beta_2 \qquad (3.2)
\]

where $X_2$ is an N×$p_2$ matrix of known constants and $\beta_2$ is a
vector of $p_2$ unknown regression coefficients. Throughout
our development, we will assume that the random vector $Y$ has
the normal distribution with variance-covariance matrix
equal to $\sigma^2 I_N$.

In our investigation we wish to determine the proper
testing procedure to follow in deciding whether the fitted
model exhibits lack of fit. In order to optimize the lack
of fit testing procedure, we will determine the location of
the check points so that the power of the test is maximized.

3.2   Testing for Lack of Fit in the Presence of
an External Estimate of Experimental Error Variation

3.2.1   The  Test  Statistic 


We  wish  to  test  the  performance  or  fit  of  a  fitted 
model  in  a  mixture  space  when  the  true  model  possibly 
contains  terms  in  addition  to  those  in  the  fitted  model. 
The  fit  of  the  model  is  to  be  tested  by  a  test  which  makes 
use  of  the  response  values  observed  at  certain  locations 
called  "check  points"  in  the  experimental  region,  by 
comparing  them  to  the  values  which  the  fitted  model  predicts 
at  the  same  check  points.   The  observed  values  at  the  check 
points  are  not  used  for  estimating  the  coefficients  in  the 
fitted  model  and  are  assumed  to  represent  the  values  of  the 
true  surface  at  the  check  points. 

Let us define the vector of differences

\[
d = (Y^* - \hat{Y}^*)
  = (Y_1^* - \hat{Y}_1^*, Y_2^* - \hat{Y}_2^*, ..., Y_k^* - \hat{Y}_k^*)'
\]

where $Y_i^*$, i = 1, 2, ..., k are observed response values at
k check points and $\hat{Y}_i^*$, i = 1, 2, ..., k are response values
predicted at the k check points by the fitted model.
$\hat{Y}_i^* = x_i^{*\prime} b_1$, where $b_1$ is the ordinary least squares estimator
of $\beta_1$, and where $x_i^{*\prime}$ is the ith row of $X^*$, the k×p matrix
whose columns are of the same form as the columns of $X$ but
with its rows evaluated at the k check points. Note that
if $\beta_2 = 0$, then $E(d) = 0$ and if $\beta_2 \ne 0$, then
$E(d) = (X_2^* - X^*(X'X)^{-1}X'X_2)\beta_2$, where $X_2^*$ is the k×$p_2$ matrix
of the same form as $X_2$ with its rows evaluated at the check
points. Let $V$ represent the variance-covariance matrix of the
random vector d. Then $V = \sigma^2 V_0$ where

\[
V_0 = I_k + X^*(X'X)^{-1}X^{*\prime}
\]

and where $I_k$ is the identity matrix of order k×k.

We assume that an unbiased estimate of $\sigma^2$ is available
and we denote this estimate by $\hat{\sigma}^2_{ext}$, where the subscript ext
stands for external, and $\hat{\sigma}^2_{ext}$ is independent of the model
being fitted. The test statistic for the hypothesis of zero
lack of fit $H_0$: $E(d) = 0$ is

\[
F = \frac{d'V_0^{-1}d/k}{\hat{\sigma}^2_{ext}} \qquad (3.3)
\]

(see Scheffe, 1958, p. 358). It will be shown later in this
section that the F ratio in Eq. (3.3) possesses either a
central F distribution or a noncentral F distribution,
depending upon whether the true model is represented by Eq.
(3.1) or Eq. (3.2).

"2  .         . 

The  variance  estimate  a      ^    that  appears  m  equation 

ext 

(3.3)  is  ordinarily  generated  from  replicated  observations 
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at  some  of  the  design  points  in  the  experiment.   We  assume 

*  2 
that  a      ^  IS  a  constant  multiple  of  a  central  chi-square 
ext 

random  variable  with  v  degrees  of  freedom.   This  is  written 
as 


a^  ^  =  SSE     /v 
ext      pure^ 


=  (aVv)(SSEp^^^/a^ 


2     2 
where  ^^E  ^^^/a   ~  x^*   Note  that  SSEpy^g  denotes  the 

portion  of  the  residual  sum  of  squares  due  to  replication 

variation  from  the  fitted  model.   The  residual  sum  of 

squares  from  the  fitted  model  may  be  partitioned  into 

SSEp^j-g  and  SSj^Qp  only  if  replicated  observations  are 

collected  at  one  or  more  design  points.   For  the  case  where 

replicate  observations  are  collected  at  all  of  the  design 

points 


n    n .         _ 

SSE      =   Z    E'''  ( Y.  .  -  Y.  )  , 
PUi^e    i=i  j=i    ^3     1- 

where  n  is  the  number  of  distinct  design  points,  n.  >  2  is 

the  number  of  replicates  at  the  ith  design  point,  Yj^^  is  the 

jth  observation  at  the  ith  design  point,  and  Y.   is  the 

average  of  the  n^  observations  at  the  ith  design  point. 

n 
Here  SSE   j^^  has  v  =  Z  (n.  -  1)   degrees  of  freedom. 

i=l   ^ 
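The pure error computation can be sketched in a few lines of plain Python. The function name `pure_error`, the replicate values, and the values assumed for $d'V_0^{-1}d$ and $k$ are hypothetical and serve only to show how the estimate enters the F ratio of Eq. (3.3).

```python
def pure_error(groups):
    """Return (SSE_pure, nu) given one list of replicates per design point."""
    sse, nu = 0.0, 0
    for obs in groups:
        mean = sum(obs) / len(obs)
        sse += sum((y - mean) ** 2 for y in obs)   # within-point variation
        nu += len(obs) - 1                         # n_i - 1 degrees of freedom
    return sse, nu

# Hypothetical replicates at three design points.
groups = [[10.0, 12.0], [7.0, 9.0, 8.0], [5.0, 5.0]]
sse_pure, nu = pure_error(groups)
sigma2_ext = sse_pure / nu

# Hypothetical value of d' V0^{-1} d obtained from k = 2 check points.
quad, k = 6.0, 2
f_ratio = (quad / k) / sigma2_ext
```

Here the three design points contribute 1, 2, and 1 degrees of freedom, so the estimate rests on four degrees of freedom in total.
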
When the fitted model and the true model are of the same form as
defined by Eq. (3.1), the quantity $d'V_0^{-1}d/\sigma^2$ possesses a
central chi-square distribution (Searle, 1971, p. 57, Theorem 2).
However, when the true model is of the form specified by Eq. (3.2),
$d'V_0^{-1}d/\sigma^2$ possesses a noncentral chi-square distribution.
Thus when the true model is of the form in Eq. (3.1),

\[
d'V_0^{-1}d/\sigma^2 \sim \chi^2_{k},
\]

but when the true model is of the form in Eq. (3.2),

\[
d'V_0^{-1}d/\sigma^2 \sim \chi'^{2}_{k;\lambda_1},
\]

where in the second case the noncentrality parameter $\lambda_1$ has
the form

\[
\lambda_1 = E(d)'V_0^{-1}E(d)/2\sigma^2
  = \beta_2'(X_2^* - X^*A)'V_0^{-1}(X_2^* - X^*A)\beta_2/2\sigma^2.
\]

The matrix $A = (X'X)^{-1}X'X_2$ is called the alias matrix and is of
order $p \times p_2$. In $\lambda_1$, the matrix $X_2^*$ is of order
$k \times p_2$ and has the same relationship to $X_2$ as $X^*$ has to $X$.

Since $SSE_{\text{pure}}/\sigma^2$ is statistically independent of
$d'V_0^{-1}d/\sigma^2$, under model (3.1) the test statistic

\[
F = \frac{d'V_0^{-1}d/k\sigma^2}{SSE_{\text{pure}}/\nu\sigma^2}
  = \frac{d'V_0^{-1}d/k}{\hat{\sigma}^2_{\text{ext}}}
\]

will have a central F distribution. When the true model contains terms
in addition to those in the fitted model, $F$ will have a noncentral F
distribution. We write these two cases as

\[
F \sim F_{k,\nu}
\]

under model (3.1), and

\[
F \sim F'_{k,\nu;\lambda_1}
\]

under model (3.2), where the noncentrality parameter is

\[
\lambda_1 = \beta_2'(X_2^* - X^*A)'V_0^{-1}(X_2^* - X^*A)\beta_2/2\sigma^2.
\]


3.2.2   The  Testing  Procedure  and  an  Expression  for  the  Power 
of  the  Test 


Given that the form of the fitted model is defined as Eq. (3.1), the
expected value of the numerator of the F statistic in Eq. (3.3) will
depend on the form of the true model. For the case where the true model
is expressed as Eq. (3.2),

\begin{align*}
E(\text{numerator}) &= E(d'V_0^{-1}d/k) \\
  &= (\sigma^2/k)\,E(\chi'^{2}_{k;\lambda_1}) \\
  &= (\sigma^2/k)(k + 2\lambda_1) \\
  &= \sigma^2 + 2\sigma^2\lambda_1/k \\
  &= \sigma^2 + \beta_2'A_1\beta_2/k, \tag{3.4}
\end{align*}

where $A_1 = (X_2^* - X^*A)'V_0^{-1}(X_2^* - X^*A)$. However, when the
true model is Eq. (3.1), $\beta_2 = 0$ and in this case $\lambda_1 = 0$,
so that $E(\text{numerator}) = \sigma^2$. Also
$\hat{\sigma}^2_{\text{ext}}$ is an unbiased estimator of $\sigma^2$ and

\[
E(\hat{\sigma}^2_{\text{ext}}) = \sigma^2. \tag{3.5}
\]

Therefore the ratio E(numerator)/E(denominator), where the denominator
is $\hat{\sigma}^2_{\text{ext}}$, will equal unity under model (3.1),
that is, when there is no lack of fit. Under model (3.2), the ratio will
be greater than or equal to unity, so lack of fit should be suspected if
the calculated F ratio in Eq. (3.3) is large. We can thus use an upper
tailed rejection region to reject the hypothesis of zero lack of fit.
The power of the test is

\[
P[F'_{k,\nu;\lambda_1} > F_{\alpha;k,\nu}],
\]

where $F_{\alpha;k,\nu}$ is the upper $100\alpha$ percentage point of the
central F distribution with $k$ numerator degrees of freedom and $\nu$
denominator degrees of freedom.
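The power expression above can be evaluated numerically. The sketch below assumes SciPy is available and uses its `scipy.stats.f` and `scipy.stats.ncf` distributions; the function name `power_upper_tailed` and the values of $k$, $\nu$, and $\lambda_1$ are hypothetical. One point needs care: with the convention used here, $E(\chi'^{2}_{k;\lambda_1}) = k + 2\lambda_1$, whereas SciPy's noncentrality argument is the sum of squared standardized means, so the value passed to SciPy is $2\lambda_1$.

```python
from scipy.stats import f, ncf

def power_upper_tailed(lam1, k, nu, alpha=0.05):
    """P[F'_{k,nu;lam1} > F_{alpha;k,nu}], lam1 in the text's convention."""
    f_crit = f.ppf(1.0 - alpha, k, nu)      # upper 100*alpha percentage point
    if lam1 == 0.0:
        return f.sf(f_crit, k, nu)          # central case: power equals alpha
    # SciPy's nc equals 2 * lam1 under the convention E(chi'^2) = k + 2*lam1.
    return ncf.sf(f_crit, k, nu, 2.0 * lam1)

# Hypothetical setting: k = 2 check points, nu = 10 error degrees of freedom.
powers = [power_upper_tailed(lam, k=2, nu=10) for lam in (0.0, 1.0, 4.0)]
```

The computed powers increase with $\lambda_1$, which is the property exploited when check points are chosen.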

It is worth noting that, from Eq. (3.4) and Eq. (3.5), testing the
hypothesis that $\beta_2 = 0$ is equivalent to testing the hypothesis
that $\lambda_1 = 0$, assuming $A_1$ is positive definite. Thus testing a
null hypothesis of zero lack of fit using the proposed testing procedure
involving the F ratio in (3.3) may be expressed as a test of the
hypotheses

\[
H_0\colon \lambda_1 = 0
\]

\[
H_a\colon \lambda_1 > 0.
\]


3.2.3   A  Method  for  Locating  Optimal  Check  Points 

Once a design for fitting model (3.1) in a mixture space is chosen and
the number of simultaneous check points is decided on, say $k \geq 1$,
the next step is to determine where in the mixture space we should place
the $k$ check points so as to maximize the power of the test for lack of
fit. The location of the check points is to be chosen independently of
the value of $\beta_2$.

The power of the upper tailed F test for lack of fit is an increasing
function of $\lambda_1$ (see Appendix 1 for proof, with $\lambda_2 = 0$).
Therefore, to maximize the power of the test we maximize the value of
$\lambda_1$, defined as

\[
\lambda_1 = \beta_2'A_1\beta_2/2\sigma^2,
\]

where $A_1 = (X_2^* - X^*A)'V_0^{-1}(X_2^* - X^*A)$, by properly selecting
the $k$ check points whose coordinates are defined in $X^*$. To maximize
the value of $\lambda_1$, we shall concentrate on the matrix $A_1$.

The matrix $A_1$ is a square matrix of order $p_2 \times p_2$ and is a
scalar quantity when $p_2 = 1$. By maximizing the scalar quantity $A_1$
with respect to the $k$ check points, the power is maximized no matter
what the value of $\beta_2$. Maximizing the scalar $A_1$ can be
accomplished by using the Controlled Random Search Procedure given by
Price (1977). This procedure is described in Appendix 2. As a
computational aid, $A_1$ can be expressed as

\[
A_1 = \frac{|V_0 + (X_2^* - X^*A)(X_2^* - X^*A)'|}{|V_0|} - 1
\]

when $p_2 = 1$, where the symbol $|B|$ denotes the determinant of the
square matrix $B$. Thus the computations reduce to evaluating two
determinants rather than inverting $V_0$ (see Scheffé, 1959, Appendix V,
p. 417).
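The determinant form of the scalar $A_1$ can be checked numerically against the quadratic form. The sketch below assumes NumPy; the matrices are random stand-ins generated only for illustration (any positive definite $V_0$ and any $k \times 1$ vector playing the role of $X_2^* - X^*A$ would do), and none of the values come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 3                                       # hypothetical number of check points
M = rng.normal(size=(k, 4))
V0 = np.eye(k) + M @ M.T                    # stand-in for I_k + X*(X'X)^{-1}X*'
c = rng.normal(size=(k, 1))                 # stand-in for X2* - X*A when p2 = 1

# Quadratic form c' V0^{-1} c, computed by solving a linear system.
a1_quadratic = (c.T @ np.linalg.solve(V0, c)).item()

# The same scalar from two determinants, with no inverse of V0 required.
a1_determinants = np.linalg.det(V0 + c @ c.T) / np.linalg.det(V0) - 1.0
```

The two values agree to rounding error, which is why the determinant form is a convenient way to evaluate the criterion repeatedly inside a search over check point locations.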
When $p_2 > 1$ and $A_1$ is no longer a scalar, maximizing $\lambda_1$
(and thus maximizing the power of the test) cannot be accomplished
without specifying $\beta_2$. In this case we make use of a lower bound
for $\lambda_1$ (Graybill, 1969, p. 330, Theorem 12.2.14(9)), defined as

\[
\mu_{\min}\,\beta_2'\beta_2/2\sigma^2 \leq \lambda_1
\]

(where $\mu_{\min}$ is the smallest eigenvalue of $A_1$), to be used in
place of $\lambda_1$. Hence an approximate solution to the maximization
of $\lambda_1$ will be achieved by finding the $k$ simultaneous check
points (using Price's procedure) that maximize $\mu_{\min}$, the smallest
eigenvalue of $A_1$. In other words, when $p_2 > 1$, and in order to
avoid specifying $\beta_2$, we seek to maximize a lower bound for
$\lambda_1$. This maximization does not depend on the value of $\beta_2$.
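The eigenvalue bound can be illustrated with a small numerical sketch. It assumes NumPy; the matrices and the vector `beta2` are hypothetical stand-ins (in practice $\beta_2$ is unknown, which is the reason for working with the bound).

```python
import numpy as np

rng = np.random.default_rng(1)
k, p2, sigma2 = 4, 2, 1.0                   # hypothetical sizes and variance
V0 = np.eye(k) + 0.5 * np.ones((k, k))      # any positive definite stand-in
C = rng.normal(size=(k, p2))                # stand-in for X2* - X*A
A1 = C.T @ np.linalg.solve(V0, C)
A1 = 0.5 * (A1 + A1.T)                      # remove rounding asymmetry

mu_min = np.linalg.eigvalsh(A1)[0]          # eigenvalues are returned ascending
beta2 = np.array([0.7, -1.3])               # hypothetical; unknown in practice
lam1 = beta2 @ A1 @ beta2 / (2.0 * sigma2)
bound = mu_min * (beta2 @ beta2) / (2.0 * sigma2)
```

Whatever value is substituted for `beta2`, `bound` never exceeds `lam1`, so a set of check points that raises `mu_min` raises a guaranteed floor on the noncentrality parameter.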

There are cases where the matrix $A_1$ is of less than full rank (less
than rank $p_2$), or equivalently where the matrix $A_1$ is positive
semi-definite, so that $\mu_{\min}$ will be equal to zero no matter which
check points are selected. One such case occurs when $k < p_2$ (when the
number of check points is less than the number of parameters in the true
model which are not in the fitted model), since when $k < p_2$

\[
\operatorname{rank}(A_1) = \operatorname{rank}[V_0^{-1/2}(X_2^* - X^*A)]
  = \operatorname{rank}(X_2^* - X^*A),
\]

and so $\operatorname{rank}(A_1) \leq \min(k, p_2)$ because the matrix
$(X_2^* - X^*A)$ is of order $k \times p_2$. Therefore when $k < p_2$, the
rank of $A_1$ is at most $k$, so that $A_1$ is of less than full rank.
Since $\mu_{\min}$ must be equal to zero when $A_1$ is positive
semi-definite, an alternative to maximizing $\mu_{\min}$ must be found
for selecting optimal check points in this case, in order to produce a
positive lower bound for $\lambda_1$. In this pursuit, let us write
$\lambda_1$ as

\begin{align*}
\lambda_1 &= \beta_2'A_1\beta_2/2\sigma^2 \\
  &= \beta_2'P\Lambda P'\beta_2/2\sigma^2 \\
  &= \beta_2'[P_1 : P_2]\,\operatorname{diag}[\Lambda_1,\ \Lambda_2 = 0]\,[P_1 : P_2]'\beta_2/2\sigma^2 \\
  &= \beta_2'P_1\Lambda_1P_1'\beta_2/2\sigma^2,
\end{align*}

where $\Lambda$ is a diagonal matrix with elements equal to the
eigenvalues of $A_1$, $P$ is an orthogonal matrix whose columns are
orthonormal eigenvectors of $A_1$, $\Lambda_1$ and $P_1$ correspond to the
positive eigenvalues of $A_1$, while $\Lambda_2 = 0$ and $P_2$ correspond
to the zero eigenvalues of $A_1$. Then by Theorem 12.2.14(9) in Graybill
(1969) we can write

\[
\mu^{+}_{\min}\, z'z/2\sigma^2 \leq \lambda_1, \tag{3.7}
\]

where $\mu^{+}_{\min}$ is the smallest positive eigenvalue of $A_1$, and
$z = P_1'\beta_2$. Thus by Eq. (3.7), an approach to maximizing a
positive lower bound for $\lambda_1$ when $A_1$ is positive
semi-definite is to select check points that maximize the smallest
positive eigenvalue of $A_1$. It must be noted, however, that this
method can only be used when $\beta_2 \in \cap\, C(P_1)$, where $C(P_1)$
denotes the column space of $P_1$ and $\cap\, C(P_1)$ denotes the
intersection of all such spaces which can be obtained at all possible
check point locations. This is because, in general, $z'z$ in (3.7)
depends on the location of the check points through its dependency on
$P_1$. If, however, $\beta_2 \in \cap\, C(P_1)$, then
$z'z = \beta_2'P_1P_1'\beta_2 = \beta_2'PP'\beta_2 = \beta_2'\beta_2$,
since $P_2'\beta_2 = 0$. It follows that when
$\beta_2 \in \cap\, C(P_1)$,
$\mu^{+}_{\min} z'z/2\sigma^2 = \mu^{+}_{\min}\beta_2'\beta_2/2\sigma^2$,
and only $\mu^{+}_{\min}$ depends on the location of the check points.
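The rank-deficient situation can be seen in a small numerical sketch. It assumes NumPy, takes $\sigma^2 = 1$, and uses random stand-in matrices with $k = 2$ check points and $p_2 = 3$ extra terms; the tolerance used to separate zero from positive eigenvalues is an arbitrary choice made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
k, p2 = 2, 3                                # fewer check points than extra terms
V0 = np.eye(k) + 0.25 * np.ones((k, k))     # positive definite stand-in
C = rng.normal(size=(k, p2))                # stand-in for X2* - X*A
A1 = C.T @ np.linalg.solve(V0, C)
A1 = 0.5 * (A1 + A1.T)                      # remove rounding asymmetry

eigvals, eigvecs = np.linalg.eigh(A1)       # ascending eigenvalues
tol = 1e-10 * eigvals.max()                 # arbitrary cutoff for "zero"
positive = eigvals > tol
mu_min_pos = eigvals[positive].min()        # smallest positive eigenvalue
P1 = eigvecs[:, positive]                   # orthonormal basis of C(P1)

beta2 = P1 @ np.array([1.0, -2.0])          # hypothetical beta_2 inside C(P1)
z = P1.T @ beta2
lam1 = beta2 @ A1 @ beta2 / 2.0             # sigma^2 taken as 1
```

Because `beta2` was built inside the column space of `P1`, `z @ z` equals `beta2 @ beta2`, and `mu_min_pos * (z @ z) / 2` is a positive lower bound for `lam1` even though the smallest eigenvalue of `A1` is zero.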

3.3   Testing  for  Lack  of  Fit  When  MSE  Is  Used 
to  Estimate  Experimental  Error  Variation 

3.3.1   The  Test  Statistic 


In this section we shall show that when an external estimate of
$\sigma^2$ is not available and the residual mean square (MSE) from the
fitted model of the form (3.1) must be used as an estimate of
$\sigma^2$, the test statistic

\[
F = \frac{d'V_0^{-1}d/k}{MSE} \tag{3.8}
\]

possesses a central F distribution when the true model is Eq. (3.1), but
possesses a doubly noncentral F distribution when the true model is
Eq. (3.2).

In Section 3.2.1, the quantity $d'V_0^{-1}d/\sigma^2$ was said to
possess a central chi-square distribution or a noncentral chi-square
distribution, depending on whether the true model was specified by
Eq. (3.1) or Eq. (3.2). Now, the residual sum of squares from the fitted
model is defined as

\[
SSE = \sum_{i=1}^{N} (Y_i - \hat{Y}_i)^2 = Y'(I_N - X(X'X)^{-1}X')Y
\]

and it is easy to show (Searle, 1971, p. 57, Theorem 2) that
$SSE/\sigma^2$ possesses a central chi-square distribution if the true
model is Eq. (3.1), but under model (3.2), $SSE/\sigma^2$ possesses a
noncentral chi-square distribution. This is expressed as

\[
SSE/\sigma^2 \sim \chi^2_{N-p}
\]

under model (3.1), and

\[
SSE/\sigma^2 \sim \chi'^{2}_{N-p;\lambda_2}
\]

under model (3.2), where the noncentrality parameter $\lambda_2$ is

\[
\lambda_2 = \beta_2'(X_2 - XA)'(X_2 - XA)\beta_2/2\sigma^2.
\]


The distributional form of the test statistic in Eq. (3.8) is derived
by knowing that the quantities $d'V_0^{-1}d/\sigma^2$ and $SSE/\sigma^2$
are statistically independent (see Appendix 3), so that

\[
F = \frac{d'V_0^{-1}d/k\sigma^2}{MSE/\sigma^2} = \frac{d'V_0^{-1}d/k}{MSE}
\]

is distributed as a central F when the true model is Eq. (3.1), but when
the true model is Eq. (3.2) the F ratio is a doubly noncentral F, that
is, under model (3.1),

\[
F \sim F_{k,N-p}
\]

and under model (3.2),

\[
F \sim F''_{k,N-p;\lambda_1,\lambda_2}.
\]


3.3.2   The  Rejection  Region  and  its  Relation  to  the  Power  of 
the  Test 

In Appendix 1 it is shown that if $k$, $N - p$, and $\lambda_2$ are
fixed, then the power of the F test using the ratio (3.8) depends on the
location of the rejection region (upper tailed or lower tailed) of the
test. The power increases with increasing values of the numerator
noncentrality parameter, $\lambda_1$, when the test is an upper tailed
test. The power decreases with increasing values of $\lambda_1$ when the
test is a lower tailed test. This means that to study ways of
increasing the power of the test, we have to determine whether the test
is an upper tailed test or a lower tailed test. Similarly, for fixed
values of $k$, $N - p$, and $\lambda_1$, the power of the F test is a
decreasing function of $\lambda_2$ for an upper tailed test, and is an
increasing function of $\lambda_2$ when the F test is a lower tailed
test (Scheffé, 1959, pp. 136-137).

To decide if the test is an upper tailed test or a lower tailed test,
we recall from Section 3.2.2 that if the true model is Eq. (3.1) then
the expected value of the numerator of the F statistic in (3.8) can be
written as

\[
E(\text{numerator}) = \sigma^2,
\]

and if the true model is Eq. (3.2),

\begin{align*}
E(\text{numerator}) &= \sigma^2 + 2\sigma^2\lambda_1/k \tag{3.9} \\
  &= \sigma^2 + \beta_2'A_1\beta_2/k,
\end{align*}

where the $p_2 \times p_2$ matrix $A_1$ is
$A_1 = (X_2^* - X^*A)'V_0^{-1}(X_2^* - X^*A)$. Similarly, it can be shown
that if the true model is Eq. (3.1), the expected value of the
denominator of the F statistic in (3.8), where the denominator equals
MSE, is

\[
E(\text{denominator}) = E(MSE) = \sigma^2,
\]

but if the true model is Eq. (3.2),

\begin{align*}
E(\text{denominator}) &= E(MSE) \\
  &= [\sigma^2/(N - p)]\,E(\chi'^{2}_{N-p;\lambda_2}) \\
  &= [\sigma^2/(N - p)][N - p + 2\lambda_2] \\
  &= \sigma^2 + 2\sigma^2\lambda_2/(N - p) \tag{3.10} \\
  &= \sigma^2 + \beta_2'A_2\beta_2/(N - p),
\end{align*}

where the $p_2 \times p_2$ matrix $A_2$ is $A_2 = (X_2 - XA)'(X_2 - XA)$.
Thus the ratio E(numerator)/E(denominator) will equal unity if the true
model is Eq. (3.1), but if the true model is Eq. (3.2), the ratio is
greater than unity if
$\beta_2'A_1\beta_2/k > \beta_2'A_2\beta_2/(N - p)$. In this latter case
we reject the null hypothesis of zero lack of fit if the calculated
value of the F ratio in (3.8) is large, so an upper tailed rejection
region seems reasonable for this test. When the true model is Eq. (3.2)
and $\beta_2'A_1\beta_2/k < \beta_2'A_2\beta_2/(N - p)$, a lower tailed
rejection region is preferred.

3.3.3   A  Method  for  Locating  Optimal  Check  Points

Given a design for fitting a model of the form in Eq. (3.1) in a
mixture space (note that fixing the design fixes $A_2$ and $N - p$), and
given the number of simultaneous check points desired, $k \geq 1$, we
now wish to determine where in the mixture space the $k$ check points
should be located so as to maximize the power of the F test for lack of
fit, where the test statistic is given in Eq. (3.8). We also wish to
position the optimal check points in a manner that is independent of
the values of the elements in $\beta_2$.

The case of an upper tailed test. To help us find $k$ simultaneous
check points that maximize the power of an upper tailed test, we shall
make use of the fact that the power is an increasing function of
$\lambda_1$. Therefore to maximize the power of the upper tailed F
test, we shall seek the locations of the $k$ check points that maximize
$\lambda_1$.

As in the case considered in Section 3.2.3, where the test statistic
had a noncentral F distribution, if the number of extra terms in the
true model is $p_2 = 1$, then maximizing $\lambda_1$ is equivalent to
maximizing the scalar $A_1$. However, as before, if $p_2 > 1$, then the
$p_2 \times p_2$ matrix $A_1$ is not a scalar and we will have to
approximate the maximization of $\lambda_1$ by maximizing a lower bound
for $\lambda_1$. This is done by finding the maximum value of
$\mu_{\min}$, the smallest eigenvalue of $A_1$, since

\[
\mu_{\min}\,\beta_2'\beta_2/2\sigma^2 \leq \lambda_1.
\]

When the number of check points is less than the order of the square
matrix $A_1$, that is, $k < p_2$, then
$\operatorname{rank}(A_1) \leq \min(k, p_2)$, and $A_1$ will have
$\mu_{\min} = 0$. For this case, we again try to maximize the smallest
positive eigenvalue of $A_1$, which we denote by $\mu^{+}_{\min}$, while
remembering from Section 3.2.3 that this technique is limited to
situations where $\beta_2 \in \cap\, C(P_1)$.

The case of a lower tailed test. To find $k$ check points to maximize
the power of a lower tailed test, we make use of the fact that the power
of the lower tailed F test increases as $\lambda_1$ decreases. Then if
$p_2 = 1$ and $A_1$ is a scalar quantity, $\lambda_1$ can be minimized
with respect to the $k$ check points by finding the check points that
minimize $A_1$. If $p_2 > 1$, then by Theorem 12.2.14(9) in Graybill
(1969), we see that an upper bound for $\lambda_1$ is

\[
\lambda_1 \leq \mu_{\max}\,\beta_2'\beta_2/2\sigma^2, \tag{3.11}
\]

where $\mu_{\max}$ is the largest eigenvalue of $A_1$. An approximate
solution to minimizing $\lambda_1$ in (3.11) can be achieved by
minimizing $\mu_{\max}$. It is not necessary to treat the case of
$k < p_2$ separately here, although $\lambda_1$ will equal zero if
$\beta_2$ is in the column space of $P_2$, where $P_2$ is the matrix
whose columns are orthonormalized eigenvectors corresponding to the
zero eigenvalues of the matrix $A_1$.

3.3.4   Determining  Whether  the  Test  Is  Upper  Tailed  or  Lower 
Tailed 

The procedures outlined in Section 3.3.3 produce a set of $k$ check
points that simultaneously maximize the power of the upper tailed test
as well as a second set of $k$ check points that simultaneously
maximize the power of the lower tailed test. The check points that are
selected maximize the power, given $A_2$, $k$, and $N - p$, without
specification of $\beta_2$, except that when $A_1$ is positive
semi-definite we require that $\beta_2 \in \cap\, C(P_1)$.

It is now necessary to decide which of our two candidates will be used
for a lack of fit test. To choose between the upper tailed test and the
lower tailed test, let us consider the quantity

\[
R = [A_1/k] - [A_2/(N - p)].
\]

If $R$ is positive definite when the true model is Eq. (3.2), then no
matter what the value of $\beta_2$ is, the ratio
E(numerator)/E(denominator) will be greater than unity, implying an
upper tailed test is to be used. Similarly, if $R$ is negative
definite, then a lower tailed test should be used. Finally, if $R$ is
not definite, then neither an upper nor a lower tailed test is indicated
and further investigation is necessary. The criterion
$R = [A_1/k] - [A_2/(N - p)]$ may yield any of the four following cases.

Case 1. If $R = [A_1/k] - [A_2/(N - p)]$ is positive definite when
$A_1$ is generated by the $k$ optimal upper tailed test check points,
and $R$ is not negative definite when $A_1$ is generated by the $k$
optimal lower tailed test check points, then we recommend that the
check points be used that yield the optimal upper tailed test with an
upper tailed rejection region.

For Case 1 it is necessary for $A_1$ to be positive definite (see
Appendix 4). Since $A_1$ is a square matrix of order $p_2 \times p_2$
with $\operatorname{rank}(A_1) \leq \min(k, p_2)$, $A_1$ can be positive
definite only if $k \geq p_2$. Thus, there must be at least $p_2$ check
points for Case 1 to hold, where $p_2$ is the number of terms in the
model of Eq. (3.2) that are not in the model of Eq. (3.1).

From inspection of equations (3.9) and (3.10), it is apparent that
testing for lack of fit in Case 1 is equivalent to testing the
hypothesis

\[
H_0\colon \frac{\lambda_1}{k} - \frac{\lambda_2}{N - p} = 0 \tag{3.12}
\]

against the alternative

\[
H_a\colon \frac{\lambda_1}{k} - \frac{\lambda_2}{N - p} > 0
\]

since $R = [A_1/k] - [A_2/(N - p)]$ is positive definite when the true
model is Eq. (3.2). In Appendix 5(a) it is shown that under Case 1, the
hypothesis given by (3.12) is equivalent to the hypothesis

\[
H_0\colon \lambda_1 = \lambda_2 = 0.
\]


Case 2. In Case 2 we assume that $R = [A_1/k] - [A_2/(N - p)]$ is not
positive definite for the $k$ optimal upper tailed test check points,
but that $R$ is negative definite for the $k$ optimal lower tailed test
check points. Here we recommend that the lower tailed test check points
be used with a lower tailed rejection region.

It is necessary for $A_2$ to be positive definite for Case 2 to occur
(see Appendix 4). However, $A_1$ need not be positive definite, and so
$k$ need not be at least $p_2$. In Case 2 then, it is possible that lack
of fit may be tested with only one check point.

By inspection of equations (3.9) and (3.10), a hypothesis of no lack of
fit is equivalent to

\[
H_0\colon \frac{\lambda_1}{k} - \frac{\lambda_2}{N - p} = 0 \tag{3.13}
\]

while the alternative hypothesis that lack of fit is present is
equivalent to

\[
H_a\colon \frac{\lambda_1}{k} - \frac{\lambda_2}{N - p} < 0
\]

since $R = [A_1/k] - [A_2/(N - p)]$ is negative definite. In Appendix
5(b) it is shown that the hypothesis given by (3.13) is equivalent to
the hypothesis

\[
H_0\colon \lambda_1 = \lambda_2 = 0.
\]

Case 3. We assume $R$ is positive definite for the $k$ optimal upper
tailed test check points, and $R$ is negative definite for the $k$
optimal lower tailed test check points. Hence either an upper or lower
tailed test may be considered as a possible test for lack of fit. If
the quantity $\beta_2'\beta_2/\sigma^2$ can be specified, then the
minimum power for both the optimal upper and optimal lower tailed tests
can be approximated, and the test with the greater minimum power is
recommended. In Appendix 4 it is shown that Case 3 can occur only when
$A_1$ is positive definite for the upper tailed test. Thus Case 3 can
only occur when there are at least $p_2$ check points.

The minimum power of the upper tailed test may be found by calculating

\[
P[F''_{k,N-p;\lambda_{1L},\lambda_{2U}} > F_{\alpha;k,N-p}], \tag{3.14}
\]

where $F_{\alpha;k,N-p}$ is the upper $100\alpha$ percentage point of the
central F distribution,

\[
\lambda_{1L} = \mu_{\min}\,\beta_2'\beta_2/2\sigma^2,
\]

and

\[
\lambda_{2U} = \delta_{\max}\,\beta_2'\beta_2/2\sigma^2,
\]

where $\mu_{\min}$ is the smallest eigenvalue of $A_1$ and
$\delta_{\max}$ is the largest eigenvalue of $A_2$. Formula (3.14)
yields a conservative lower bound for the power of the optimal upper
tailed test. Note that $A_1$ is generated using the optimal upper
tailed test check points. The cumulative distribution function of $F''$
can be approximated by multiplying the cumulative probabilities of the
central F distribution by a constant (Johnson and Kotz, 1970, p. 197).
This approximation is described in Appendix 6. Other approximations for
$F''$ (such as the Edgeworth series approximation suggested by
Mudholkar, Chaubey, and Lin, 1976) exist which are generally more
accurate, but we chose to use the approximation given in Johnson and
Kotz (1970, p. 197) due to its simplicity. Additionally, the
approximation of Mudholkar, Chaubey, and Lin (1976) produced negative
probabilities when only one degree of freedom was available in either
the numerator or denominator of $F''$. This problem was avoided by
using the approximation given by Johnson and Kotz (1970).

The minimum power of the optimal lower tailed test can be approximated
similarly (if $\beta_2'\beta_2/\sigma^2$ is specified) by calculating

\[
P[F''_{k,N-p;\lambda_{1U},\lambda_{2L}} < F_{(1-\alpha);k,N-p}],
\]

where

\[
\lambda_{1U} = \mu_{\max}\,\beta_2'\beta_2/2\sigma^2
\]

and

\[
\lambda_{2L} = \delta_{\min}\,\beta_2'\beta_2/2\sigma^2,
\]

with $\mu_{\max}$ equal to the largest eigenvalue of $A_1$ and
$\delta_{\min}$ equal to the smallest eigenvalue of $A_2$. Note that
$A_1$ is generated by using the optimal lower tailed test check points.
For the lower tailed test, $A_1$ may be positive semi-definite, and if
$\beta_2$ is in the column space of $P_2$ then $\lambda_1 = 0$.

In Case 3, the upper tailed test is a test of

\[
H_0\colon \lambda_1 = \lambda_2 = 0
\]

\[
H_a\colon \frac{\lambda_1}{k} - \frac{\lambda_2}{N - p} > 0,
\]

while the lower tailed test is a test of

\[
H_0\colon \lambda_1 = \lambda_2 = 0
\]

\[
H_a\colon \frac{\lambda_1}{k} - \frac{\lambda_2}{N - p} < 0.
\]

Case 4. In Case 4 we assume that $R = [A_1/k] - [A_2/(N - p)]$ is not
positive definite for the $k$ optimal upper tailed test check points
and $R$ is not negative definite for the $k$ optimal lower tailed test
check points. Here it is useful to write the difference between the
expected value of the numerator and the expected value of the
denominator of the F ratio in (3.8) as

\begin{align*}
\beta_2'[A_1/k - A_2/(N - p)]\beta_2 &= \beta_2'S\Omega S'\beta_2 \\
  &= \beta_2'S_1\Omega_1S_1'\beta_2 + \beta_2'S_3\Omega_3S_3'\beta_2,
\end{align*}

where $\Omega = \operatorname{diag}(\Omega_1, \Omega_2, \Omega_3)$ is a
diagonal matrix consisting of the eigenvalues of $R$, $\Omega_1$ is a
diagonal matrix of the positive eigenvalues of $R$, $\Omega_2$ is a
diagonal matrix of the zero eigenvalues of $R$, and $\Omega_3$ is a
diagonal matrix of the negative eigenvalues of $R$. The orthogonal
matrix $S$ can be expressed as $S = [S_1 : S_2 : S_3]$, where the
matrices $S_1$, $S_2$, and $S_3$ have columns which are orthonormalized
eigenvectors corresponding to $\Omega_1$, $\Omega_2$, and $\Omega_3$,
respectively.

In Case 4, neither the optimal upper tailed test nor the optimal lower
tailed test is applicable for all values of $\beta_2$. For
completeness, we note that Case 4 actually consists of nine subcases,
where $R$ may be positive semi-definite, negative semi-definite, or
indefinite for either the optimal upper tailed test or lower tailed
test check points. These subcases are listed in Table 2.


Table 2.   Nine Subcases of Case 4.

             R, Upper        R, Lower
Subcase      Tailed Test     Tailed Test
   1         PSD             PSD
   2         PSD             NSD
   3         PSD             I
   4         NSD             PSD
   5         NSD             NSD
   6         NSD             I
   7         I               PSD
   8         I               NSD
   9         I               I

PSD = positive semi-definite, NSD = negative semi-definite,
I = indefinite.


If $\beta_2$ lies in the column space of $S_2$, then
$\beta_2'[A_1/k - A_2/(N - p)]\beta_2$ is zero, and therefore lack of
fit is not testable with either an upper or lower tailed test. A
sufficient condition for the test for lack of fit to be upper tailed in
Case 4 is that $\beta_2$ be in the column space of $[S_1 : S_2]$, but
not entirely in the column space of $S_2$. In this case

\begin{align*}
\beta_2'[A_1/k - A_2/(N - p)]\beta_2
  &= \beta_2'S_1\Omega_1S_1'\beta_2 + \beta_2'S_3\Omega_3S_3'\beta_2 \\
  &= \beta_2'S_1\Omega_1S_1'\beta_2 + 0 \\
  &= \beta_2'S_1\Omega_1S_1'\beta_2,
\end{align*}

and $\beta_2'[A_1/k - A_2/(N - p)]\beta_2$ will be greater than zero,
indicating an upper tailed test. Similarly, a sufficient condition for
the test for lack of fit to be lower tailed is that $\beta_2$ be in the
column space of $[S_2 : S_3]$, but not entirely in the column space of
$S_2$. Then

\begin{align*}
\beta_2'[A_1/k - A_2/(N - p)]\beta_2
  &= 0 + \beta_2'S_3\Omega_3S_3'\beta_2 \\
  &= \beta_2'S_3\Omega_3S_3'\beta_2,
\end{align*}

which makes $\beta_2'[A_1/k - A_2/(N - p)]\beta_2$ less than zero,
indicating a lower tailed test.

To determine whether $\beta_2$ is in the column space of $[S_1 : S_2]$,
let us define the augmented matrix $Q_1 = [\beta_2 : S_1 : S_2]$. If
$Q_1'Q_1$ has a zero eigenvalue, then $\beta_2$ is in the column space
of $[S_1 : S_2]$. Similarly, if we define $Q_2 = [\beta_2 : S_2]$ and
$Q_3 = [\beta_2 : S_2 : S_3]$, then $\beta_2$ is in the column space of
$S_2$ if $Q_2'Q_2$ has a zero eigenvalue, and $\beta_2$ is in the column
space of $[S_2 : S_3]$ if $Q_3'Q_3$ has a zero eigenvalue.

Given that we are in a particular subcase of the nine subcases
described in Table 2, we recommend that lack of fit be tested with the
upper tailed test check points if it is determined that $\beta_2$ is
such that $\beta_2'[A_1/k - A_2/(N - p)]\beta_2$ is positive when $A_1$
is generated from the upper tailed test check points. Likewise, for the
same given subcase, if the value of $\beta_2$ of interest is determined
to produce a negative value for
$\beta_2'[A_1/k - A_2/(N - p)]\beta_2$ when $A_1$ is generated from the
lower tailed test check points, then we recommend that lack of fit be
tested with the lower tailed test.

We see then that Case 4 is an undesirable situation in practice, since,
in order to test for lack of fit, we must assume a priori that any lack
of fit is due to a nonzero value of $\beta_2$ that produces an upper
tailed or lower tailed rejection region. However, it would seem rare
that such knowledge would be available.
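The definiteness check on $R$ that separates the four cases of this section can be sketched as follows. This is an illustrative helper, not part of the procedure as stated in the text: it assumes NumPy, the function name `classify_R` and the numerical tolerance are hypothetical choices, and the diagonal matrices in the usage lines are invented values.

```python
import numpy as np

def classify_R(A1, A2, k, n_minus_p, tol=1e-10):
    """Classify R = A1/k - A2/(N - p) by the signs of its eigenvalues."""
    R = A1 / k - A2 / n_minus_p
    eig = np.linalg.eigvalsh(0.5 * (R + R.T))     # R is symmetric in theory
    scale = max(1.0, np.abs(eig).max())
    pos = eig > tol * scale
    neg = eig < -tol * scale
    if pos.all():
        return "positive definite"       # upper tailed region for every beta_2
    if neg.all():
        return "negative definite"       # lower tailed region for every beta_2
    if not neg.any():
        return "positive semi-definite"
    if not pos.any():
        return "negative semi-definite"
    return "indefinite"

# Hypothetical matrices with p2 = 2, k = 2 check points and N - p = 4.
A2 = np.diag([4.0, 4.0])
upper = classify_R(np.diag([4.0, 6.0]), np.diag([1.0, 1.0]), 2, 4)
lower = classify_R(np.diag([0.2, 0.1]), A2, 2, 4)
mixed = classify_R(np.diag([4.0, 0.0]), A2, 2, 4)
```

Running the classification once with $A_1$ from the upper tailed check points and once with $A_1$ from the lower tailed check points identifies which of Cases 1 through 4, or which row of Table 2, applies.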

3.4   Examples 

We  now  present  several  examples  to  illustrate  the 
technique  for  locating  optimal  check  points  to  be  used  in 
testing  for  lack  of  fit  in  a  mixture  model. 
3.4.1   Theoretical  Examples 

Example 1. In this example a second order canonical polynomial model
is fitted in three mixture components using the {3,2} simplex lattice
design, which is presented in Figure 1 of Chapter 1. The true model is
assumed to be the special cubic model containing the term
$\beta_{123}x_1x_2x_3$ in addition to the six terms of the fitted model.
The expected values of the response at the six design points are
assumed to be represented by the fitted model in the form

\[
E(Y) = X\boldsymbol{\beta}_1,
\]

but with the true model the expectations are written as

\[
E(Y) = X\boldsymbol{\beta}_1 + X_2\boldsymbol{\beta}_2,
\]

where $X$ is a $6 \times 6$ matrix with rows that define the coordinates
of the six design points and columns that correspond to the six terms
in the fitted model ($x_1$, $x_2$, $x_3$, $x_1x_2$, $x_1x_3$, $x_2x_3$),
$\boldsymbol{\beta}_1$ is the $6 \times 1$ vector of regression
coefficients
$(\beta_1, \beta_2, \beta_3, \beta_{12}, \beta_{13}, \beta_{23})'$,
$X_2$ is a $6 \times 1$ column vector containing the values of the term
$x_1x_2x_3$ at the design points, and $\boldsymbol{\beta}_2$ is the
single regression coefficient $\beta_{123}$.

The  {3,2}  simplex  lattice  design  consists  of  only  six 
design  points,  and  since  six  parameters  are  estimated  in  the 
second  order  fitted  model,  there  are  no  degrees  of  freedom 
remaining  for  obtaining  an  estimate  of  the  experimental 
error, $\sigma^2$. We assume therefore that an external estimate of
$\sigma^2$ is available, $\hat\sigma^2_{ext}$, which will be used in the
denominator of the lack of fit F statistic given in Eq. (3.3).

Since there is one term in the true model in addition
to those in the fitted model, that is $p_2 = 1$, we know that
in order to locate k simultaneous check points that maximize
the power of the test for lack of fit it is necessary to
maximize the scalar quantity

$$\Lambda_1 = (X_2^* - X^*A)'V_0^{-1}(X_2^* - X^*A)$$

with respect to the coordinates of the k check points. Here
$X_2^*$ is a k-element column vector with ith element equal to
the value of $x_{i1}^*x_{i2}^*x_{i3}^*$ at the ith check point, $X^*$ is a
k x 6 matrix with ith row equal to the value of
$(x_{i1}^*, x_{i2}^*, x_{i3}^*, x_{i1}^*x_{i2}^*, x_{i1}^*x_{i3}^*, x_{i2}^*x_{i3}^*)$
at the ith check point, $A = (X'X)^{-1}X'X_2$ is the 6 x 1 alias vector, and
$V_0 = I_k + X^*(X'X)^{-1}X^{*\prime}$. This maximization is accomplished
by use of the Controlled Random Search Procedure (Price,
1977), which is described in Appendix 2.

When only a single (k = 1) optimal check point is
desired the Controlled Random Search Procedure locates a
point $(x_1^*, x_2^*)$ which maximizes

$$\Lambda_1 = (X_2^* - X^*A)'V_0^{-1}(X_2^* - X^*A),$$

where

$$X_2^* = x_1^*x_2^*x_3^* = x_1^*x_2^*(1 - x_1^* - x_2^*),$$

$$X^* = (x_1^*, x_2^*, x_3^*, x_1^*x_2^*, x_1^*x_3^*, x_2^*x_3^*)$$

$$\quad = (x_1^*, x_2^*, (1 - x_1^* - x_2^*), x_1^*x_2^*, x_1^*(1 - x_1^* - x_2^*), x_2^*(1 - x_1^* - x_2^*)),$$

and $V_0 = 1 + X^*(X'X)^{-1}X^{*\prime}$. The value of $\Lambda_1$ is calculated
using the formula of Eq. (3.6). Following this procedure,
we find that the single check point that maximizes $\Lambda_1$, and
thus maximizes the power of the test, is the centroid of the
triangular factor space (1/3, 1/3, 1/3). The value of $\Lambda_1$ at
this centroid point is $\Lambda_1 = 0.00084$.

When the Controlled Random Search Procedure is used to
locate k = 2 simultaneous check points that maximize $\Lambda_1$, the
centroid (1/3, 1/3, 1/3) is selected twice, and $\Lambda_1 = 0.00121$.
For three simultaneous optimal check points, the
centroid is selected three times, and $\Lambda_1 = 0.00142$.
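
The three values quoted for the centroid can be checked directly from the definition of the scalar $\Lambda_1 = (X_2^* - X^*A)'V_0^{-1}(X_2^* - X^*A)$. The sketch below is only an illustration under our own assumptions: it uses NumPy, the helper names `quad_terms`, `cubic_term`, and `lambda1` are hypothetical, and it evaluates the quantity at given check points rather than performing the Controlled Random Search itself.

```python
import numpy as np

def quad_terms(points):
    """Rows (x1, x2, x3, x1x2, x1x3, x2x3) of the second order model."""
    p = np.atleast_2d(np.asarray(points, dtype=float))
    x1, x2, x3 = p[:, 0], p[:, 1], p[:, 2]
    return np.column_stack([x1, x2, x3, x1 * x2, x1 * x3, x2 * x3])

def cubic_term(points):
    """Values of the special cubic term x1*x2*x3."""
    p = np.atleast_2d(np.asarray(points, dtype=float))
    return p[:, 0] * p[:, 1] * p[:, 2]

def lambda1(design, checks):
    """Scalar (X2* - X*A)' V0^{-1} (X2* - X*A) for p2 = 1."""
    X, X2 = quad_terms(design), cubic_term(design)
    Xs, X2s = quad_terms(checks), cubic_term(checks)
    XtX_inv = np.linalg.inv(X.T @ X)
    A = XtX_inv @ X.T @ X2                       # alias vector
    V0 = np.eye(len(Xs)) + Xs @ XtX_inv @ Xs.T   # k x k
    b = X2s - Xs @ A
    return float(b @ np.linalg.solve(V0, b))

# {3,2} simplex lattice: three vertices and three edge midpoints.
lattice_32 = [(1, 0, 0), (0, 1, 0), (0, 0, 1),
              (0.5, 0.5, 0), (0.5, 0, 0.5), (0, 0.5, 0.5)]
centroid = (1 / 3, 1 / 3, 1 / 3)

for k in (1, 2, 3):
    # k replicates of the centroid used as simultaneous check points
    print(k, round(lambda1(lattice_32, [centroid] * k), 5))
```

Under these assumptions the printed values should agree, after rounding, with the 0.00084, 0.00121, and 0.00142 reported above.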

To test whether the second order model exhibits lack of
fit, when we suspect the special cubic model is the true
model, we form the F ratio

$$F = \frac{d'V_0^{-1}d/k}{\hat\sigma^2_{ext}}$$

with the single check point (1/3, 1/3, 1/3) where
$d = Y^* - \hat{Y}^*$, $Y^*$ is the observed response, $\hat{Y}^*$ is the
response predicted by the second order fitted model at (1/3, 1/3,
1/3), and $V_0 = 1 + (1/3, 1/3, 1/3, 1/9, 1/9, 1/9)(X'X)^{-1}(1/3, 1/3, 1/3, 1/9, 1/9, 1/9)'$.
If the calculated value of the F ratio exceeds $F_{\alpha;1,\nu}$,
where $\nu$ equals the number of degrees of freedom associated with
$\hat\sigma^2_{ext}$, then we reject the null hypothesis that the second
order model is the true model in favor of the alternative
hypothesis that the special cubic model is the true model.
Equivalently, we reject $H_0: \lambda_1 = 0$ in favor of
$H_a: \lambda_1 > 0$. For k = 2 or k = 3 check points, the value of the
F ratio is calculated using the observed and predicted responses
at the two or three replicates at the centroid. The hypothesis
$H_0: \lambda_1 = 0$ is rejected in favor of $H_a: \lambda_1 > 0$ if F
exceeds $F_{\alpha;k,\nu}$.

Example 2. In Example 2 we illustrate the second of
the four cases that could arise when MSE is used as an
estimate of $\sigma^2$ in the lack of fit test statistic (see
Section 3.3.4). We again fit a second order canonical
polynomial model in three mixture components, and assume the
true model is special cubic. The design to be used is the
q = 3 simplex centroid design, which consists of seven
design points, and is illustrated in Figure 2 of Chapter 1.

There are six parameters to be estimated and seven
design points; hence one degree of freedom can be used to
calculate MSE. We shall use MSE to estimate $\sigma^2$. Optimal
upper and lower tailed test check points must be located,
and then a decision is made as to which test should
be used. The actual testing for lack of fit involves the F
statistic in (3.8).

As in Example 1, $p_2 = 1$, since there is one term in the
true model in addition to those in the fitted model. Thus
$\Lambda_1$ is a scalar whose value we seek to optimize with respect
to the desired number of check points, k. When only a
single check point is sought for the purpose of testing lack
of fit, the Controlled Random Search Procedure has two
functions. First, the procedure is used to locate the
optimal candidate check point for an upper tailed test by
locating the check point that maximizes the scalar $\Lambda_1$.
Secondly, the procedure is used to locate the optimal
candidate check point for a lower tailed test, which is
accomplished by locating the point that minimizes $\Lambda_1$. The
quantity $R = [\Lambda_1/k] - [\Lambda_2/(N - p)]$ is then calculated to
determine whether the upper or lower tailed test will be
used. If R is positive for the candidate check point for an
upper tailed test, then the test is upper tailed, and the
test is lower tailed if the candidate check point for a
lower tailed test produces a negative value for R. Note
that $\Lambda_2 = (X_2 - XA)'(X_2 - XA)$ is fixed once the design is
specified, since $\Lambda_2$ does not depend on the check points.
Using the Controlled Random Search Procedure it is found
that the maximum value of $\Lambda_1$ occurs at
$(x_1, x_2, x_3) = (1/3, 1/3, 1/3)$, which will be the location for
the check point for the upper tailed test. Calculating $\Lambda_1$ at
this centroid point, we find that
$R = [\Lambda_1/k] - [\Lambda_2/(N - p)] = [(3.7258 \times 10^{-4})/1] - [(8.4175 \times 10^{-4})/1] = -4.6917 \times 10^{-4}$.
Since R is negative, the test is not upper tailed.
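
The quantities entering R for the single centroid check point can be recomputed from the design alone. The following is a hedged NumPy sketch, not the procedure used in the text: the helper name `model_terms` and the variable names are our own, and only the evaluation of $\Lambda_1$, $\Lambda_2$, and R at a given check point is shown.

```python
import numpy as np

def model_terms(points):
    """Return (X, X2): second order terms and the special cubic term."""
    p = np.atleast_2d(np.asarray(points, dtype=float))
    x1, x2, x3 = p[:, 0], p[:, 1], p[:, 2]
    X = np.column_stack([x1, x2, x3, x1 * x2, x1 * x3, x2 * x3])
    return X, x1 * x2 * x3

centroid = (1 / 3, 1 / 3, 1 / 3)
# q = 3 simplex centroid design: vertices, edge midpoints, overall centroid.
simplex_centroid = [(1, 0, 0), (0, 1, 0), (0, 0, 1),
                    (0.5, 0.5, 0), (0.5, 0, 0.5), (0, 0.5, 0.5), centroid]

X, X2 = model_terms(simplex_centroid)
N, p = X.shape                        # N = 7 design points, p = 6 parameters
XtX_inv = np.linalg.inv(X.T @ X)
A = XtX_inv @ X.T @ X2                # alias vector
resid = X2 - X @ A
lam2 = float(resid @ resid)           # Lambda_2 = (X2 - XA)'(X2 - XA)

k = 1                                 # one check point at the centroid
Xs, X2s = model_terms([centroid] * k)
b = X2s - Xs @ A
V0 = np.eye(k) + Xs @ XtX_inv @ Xs.T
lam1 = float(b @ np.linalg.solve(V0, b))

R = lam1 / k - lam2 / (N - p)
print(f"Lambda1={lam1:.4e}  Lambda2={lam2:.4e}  R={R:.4e}")
```

A negative R at the point that maximizes $\Lambda_1$ is what rules out the upper tailed test in this example.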

Using the Controlled Random Search Procedure to
minimize $\Lambda_1$, we find that a subregion of the factor space
exists in which all points yield a near minimum value for
$\Lambda_1$. We choose the point (0.0189, 0.9269, 0.0542) at random
from this subregion to be used as the optimal candidate for
a lower tailed test. Here
$R = 0 - [(8.4175 \times 10^{-4})/1] = -8.4175 \times 10^{-4}$.

Since R is negative for both the optimal upper tailed
test check point and for the optimal lower tailed test check
point, we have Case 2 of Section 3.3.4. The upper tailed
test check point is disregarded, and the lower tailed test
check point (0.0189, 0.9269, 0.0542) is used to test for
lack of fit. If the calculated F ratio,

$$F = \frac{d'V_0^{-1}d/1}{MSE},$$

is less than $F_{(1-\alpha);1,1}$, then $H_0: \lambda_1 = \lambda_2 = 0$ is
rejected in favor of $H_a: [\lambda_1/1] - [\lambda_2/1] < 0$, that is we
conclude that the second order model exhibits lack of fit, and the
true model is special cubic.

When two simultaneous check points are desired for
testing lack of fit, we can again use the Controlled Random
Search Procedure to locate the optimal settings. To
maximize the scalar $\Lambda_1$, we find that both check points
should be selected at (1/3, 1/3, 1/3), for an upper tailed
test. With our calculations
$R = [(5.8275 \times 10^{-4})/2] - [(8.4175 \times 10^{-4})/1] = -5.5038 \times 10^{-4}$,
and since R is negative, the test is not upper tailed.

Minimizing $\Lambda_1$ to locate two optimal lower tailed test
check points yields a subregion in the factor space of
optimal check points. The pair of check points (0.3749,
0.5752, 0.0499) and (0.5332, 0.4169, 0.0499) is selected at
random from this subregion, and these check points yield
$R = 0 - [(8.4175 \times 10^{-4})/1] = -8.4175 \times 10^{-4}$.

Since R is negative for the upper tailed test points
and the lower tailed test points, we have Case 2 of Section
3.3.4 again and the lower tailed test check points are used
to test for lack of fit. The hypothesis
$H_0: \lambda_1 = \lambda_2 = 0$ is rejected in favor of
$H_a: [\lambda_1/2] - [\lambda_2/1] < 0$ if the calculated value of
$F = (d'V_0^{-1}d/2)/MSE$ is less than $F_{(1-\alpha);2,1}$, in which
case we say lack of fit of the model is present.

If an external estimate $\hat\sigma^2_{ext}$ had been available for
this example, then the optimal upper tailed test check
points could have been used in the F ratio,
$F = (d'V_0^{-1}d/k)/\hat\sigma^2_{ext}$, and lack of fit would then be
detected if the calculated value of F exceeded $F_{\alpha;k,\nu}$.

Example 3. Example 3 illustrates the procedure for
locating optimal check points when there are two terms in
the true model in addition to those in the fitted model. A
second order canonical polynomial model in three mixture
components is fitted using a q = 3 simplex centroid design.
The true model is assumed to contain eight terms, six of
which are the same terms as in the fitted model, with the
additional two terms being the third order terms
$\delta_{12}x_1x_2(x_1 - x_2)$ and $\beta_{123}x_1x_2x_3$. As in Example 2,
there is one degree of freedom for MSE which is used to estimate
$\sigma^2$. The test statistic, $F = (d'V_0^{-1}d/k)/MSE$, is given in
equation (3.8).

Since $p_2 = 2$ and $\Lambda_1$ is a 2 x 2 matrix, locating the
optimal upper tailed test check points by the procedure of
maximizing $\lambda_1$ is assisted by the maximizing of a lower bound
for $\lambda_1$, namely maximizing $\mu_{min}\beta_2'\beta_2/2\sigma^2$, where
$\mu_{min}$ is the smallest eigenvalue of $\Lambda_1$. Since $\beta_2$ and
$\sigma^2$ are unknown, this is equivalent to maximizing $\mu_{min}$. For
$\mu_{min}$ to exceed zero, it is necessary that $\Lambda_1$ be of full
rank, and since $rank(\Lambda_1) \le \min(k, p_2)$, it is necessary to
select $k \ge 2$ check points. If $\Lambda_1$ is less than full rank, and
thus is positive semi-definite, only a subset of possible values of
$\beta_2$ could be considered to make it possible to test for lack of
fit with an upper tailed test.

Using the Controlled Random Search Procedure, the
points that maximize $\mu_{min}$ are found to be (0.418, 0.277,
0.305) and (0.277, 0.418, 0.305). These points are thus
optimal candidates for upper tailed test check points. At
these check points we have $\mu_{min} = 5.1623 \times 10^{-4}$,
$\Lambda_1 = diag[5.1623 \times 10^{-4}, 5.1916 \times 10^{-4}]$,
$\Lambda_2 = diag[0, 8.4175 \times 10^{-4}]$, and
$R = [\Lambda_1/2] - [\Lambda_2/1] = diag[2.5811 \times 10^{-4}, -5.8217 \times 10^{-4}]$.
Since the eigenvalues of R are $-5.8217 \times 10^{-4}$ and
$2.5811 \times 10^{-4}$, R is indefinite. Following the suggested
procedure for Case 4 of Section 3.3.4, we note that an upper
tailed test for lack of fit exists if the value of
$\beta_2 = [\delta_{12}, \beta_{123}]'$ is in the column space of
$[S_1 : S_2]$ but not entirely in the column space of $S_2$, where
$S_1$ is the matrix whose columns are the orthonormalized
eigenvectors of R corresponding to the positive eigenvalues of R,
and $S_2$ is the matrix whose columns are the orthonormalized
eigenvectors of R corresponding to the zero eigenvalues of
R. Since R has no zero eigenvalues in this example, $S_2$ does
not exist, but $S_1$ is the column vector, $S_1 = [1, 0]'$. Thus
if $\beta_2$ is of the form $\beta_2 = [\delta_{12}, 0]'$, where
$\delta_{12} \ne 0$, then $\beta_2$ is in the column space of $S_1$ and the
test is upper tailed.

The matrix $\Lambda_2$ has rank one and therefore is positive
semi-definite. Hence it is impossible to locate two check
points that minimize $\mu_{max}$ and also make
$R = [\Lambda_1/2] - [\Lambda_2/1]$ negative definite (see Appendix 4),
that is, it is impossible to find a lower tailed test that is
capable of testing lack of fit for all values of $\beta_2$. However,
if we use the Controlled Random Search Procedure to locate two
check points that minimize an upper bound for $\lambda_1$, which is
$\mu_{max}\beta_2'\beta_2/2\sigma^2$, then by minimizing $\mu_{max}$, we find
that any of the check points in a particular subregion of the factor
space yield a near minimum for $\mu_{max}$. One pair of points in
this subregion is selected as the points to be used as
optimal lower tailed test check points, namely the pair
consisting of the point (0.053, 0, 0.947) replicated twice.
Replicating this check point, we find
$\mu_{max} = 7.3900 \times 10^{-11}$,
$\Lambda_1 = diag[0, 7.3900 \times 10^{-11}]$,
$\Lambda_2 = diag[0, 8.4175 \times 10^{-4}]$, and
$R = [\Lambda_1/2] - [\Lambda_2/1] = diag[0, -8.4175 \times 10^{-4}]$.
The eigenvalues of R are 0 and $-8.4175 \times 10^{-4}$, implying that
R is negative semi-definite. The values of $\beta_2$ that are in
the column space of $[S_2 : S_3]$ but not entirely in the column
space of $S_2$ will provide a lower tailed test. Here,
$[S_2 : S_3] = diag[1, 1]$ and $S_2 = [1, 0]'$. Thus, the test for lack
of fit is lower tailed if $\beta_{123} \ne 0$.
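
The eigenvalue bookkeeping in this example can be sketched in a few lines. The code below is an illustration under our own assumptions: it uses NumPy, the function name `split_eigenvectors` and the tolerance are hypothetical, and the diagonal matrices are simply the values quoted above rather than being recomputed from the design.

```python
import numpy as np

def split_eigenvectors(R, tol=1e-12):
    """Group orthonormal eigenvectors of a symmetric R by eigenvalue sign.

    Returns (eigenvalues, S1, S2, S3) for positive, zero, and negative
    eigenvalues respectively; a group may have zero columns.
    """
    vals, vecs = np.linalg.eigh(R)
    S1 = vecs[:, vals > tol]
    S2 = vecs[:, np.abs(vals) <= tol]
    S3 = vecs[:, vals < -tol]
    return vals, S1, S2, S3

Lam2 = np.diag([0.0, 8.4175e-4])

# Upper tailed candidates: R = Lambda1/2 - Lambda2/1 is indefinite.
R_upper = np.diag([5.1623e-4, 5.1916e-4]) / 2 - Lam2
vals_u, S1_u, S2_u, S3_u = split_eigenvectors(R_upper)
print(vals_u, S1_u.ravel())

# Lower tailed candidates: R is negative semi-definite.
R_lower = np.diag([0.0, 7.3900e-11]) / 2 - Lam2
vals_l, S1_l, S2_l, S3_l = split_eigenvectors(R_lower)
print(vals_l, S2_l.ravel(), S3_l.ravel())
```

Eigenvectors are determined only up to sign, so a returned column may be [-1, 0]' rather than [1, 0]'; the column space is the same in either case.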

For values of $\beta_2$ that produce an upper tailed test we
use the check points (0.418, 0.277, 0.305) and (0.277,
0.418, 0.305) with the F ratio

$$F = \frac{d'V_0^{-1}d/2}{MSE}$$

and conclude there is lack of fit if the calculated value of
F exceeds $F_{\alpha;2,1}$. For values of $\beta_2$ that produce a lower
tailed test, we use two replicates of the check point
(0.053, 0, 0.947), and conclude there is lack of fit if F is
less than $F_{(1-\alpha);2,1}$, where again F is calculated by
$F = (d'V_0^{-1}d/2)/MSE$.

Example 4. Example 4 illustrates Case 3 of Section
3.3.4 in which MSE is used to estimate $\sigma^2$ in the lack of fit
test statistic. A second order canonical polynomial model
in three mixture components is fitted using the {3,3}
simplex lattice design, which appears in Figure 5. The true
model is assumed to be special cubic, thus $p_2 = 1$ and $\Lambda_1$ is
a scalar. The {3,3} design consists of ten design points
and since there are six parameters to be estimated in the
fitted model, $\sigma^2$ can be estimated by MSE with N - p = 10 - 6
= 4 degrees of freedom.

We first suppose that a single check point is to be
used to test for lack of fit. Using the Controlled Random
Search Procedure we find the single check point that
maximizes the scalar

$$\Lambda_1 = (X_2^* - X^*A)'V_0^{-1}(X_2^* - X^*A)$$

is located at the centroid of the simplex factor space.
Thus $(x_1^*, x_2^*, x_3^*) = (1/3, 1/3, 1/3)$ is the optimal candidate
for an upper tailed test check point. At this centroid
point, $\Lambda_1 = 4.9076 \times 10^{-4}$. For the {3,3} design the scalar
quantity $\Lambda_2 = (X_2 - XA)'(X_2 - XA)$ is fixed and is equal to
$\Lambda_2 = 9.4062 \times 10^{-4}$ and thus,
$R = [\Lambda_1/k] - [\Lambda_2/(N - p)] = [(4.9076 \times 10^{-4})/1] - [(9.4062 \times 10^{-4})/4] = 2.5560 \times 10^{-4}$.

The point that is the optimal candidate for a lower
tailed test check point is chosen randomly from a subregion
of points in the factor space, in which all points minimize
$\Lambda_1$. The point selected has the value
$(x_1^*, x_2^*, x_3^*) = (0.560, 0.410, 0.030)$. Here
$\Lambda_1 = 9.6590 \times 10^{-7}$ and
$R = [(9.6590 \times 10^{-7})/1] - [(9.4062 \times 10^{-4})/4] = -2.3419 \times 10^{-4}$.


Figure 5. The {3,3} simplex lattice design, with vertices
(1,0,0), (0,1,0), and (0,0,1).

Since R is positive for the optimal upper tailed test
check point (1/3, 1/3, 1/3) and R is negative for the
optimal lower tailed test check point (0.560, 0.410, 0.030)
we are in Case 3 of Section 3.3.4. Either the upper or
lower tailed test could be used to test for lack of fit, but
if the quantity $\beta_2'\beta_2/\sigma^2$ can be specified, then we will
choose to use the test that has greater minimum power, since
greater power means that we are more likely to detect lack
of fit when in fact lack of fit exists. In this example
$\beta_2 = \beta_{123}$.

For illustrative purposes, we arbitrarily choose
$\beta_2'\beta_2/\sigma^2 = 2000$, so that an approximate conservative lower
bound for the power of the upper tailed test is found by
calculating

$$P[F''_{k,N-p;\lambda_{1L},\lambda_{2U}} > F_{\alpha;k,N-p}],$$

where $F_{\alpha;k,N-p}$ is the upper $100\alpha$ percentage point of the
central F distribution, k is the number of check points, N
is the total number of response observations, p is the
number of parameters in the fitted model,
$\lambda_{1L} = \mu_{min}\beta_2'\beta_2/2\sigma^2$, and
$\lambda_{2U} = \delta_{max}\beta_2'\beta_2/2\sigma^2$. The
quantity $\mu_{min}$ is the smallest eigenvalue of $\Lambda_1$, where
$\Lambda_1$ is evaluated at the optimal upper tailed test check point.
Since $\Lambda_1$ is a scalar, $\mu_{min} = \Lambda_1$. Likewise,
$\delta_{max}$ is the largest eigenvalue of $\Lambda_2$, and since in this
example $\Lambda_2$ is a scalar, $\delta_{max} = \Lambda_2$. In this example
we have k = 1, N - p = 10 - 6 = 4,
$\lambda_{1L} = \mu_{min}\beta_2'\beta_2/2\sigma^2 = (4.9076 \times 10^{-4})(2000/2) = 4.9076 \times 10^{-1}$,
and
$\lambda_{2U} = \delta_{max}\beta_2'\beta_2/2\sigma^2 = (9.4062 \times 10^{-4})(2000/2) = 9.4062 \times 10^{-1}$.
Using the approximation to the cumulative probabilities of the doubly
noncentral F distribution given by Johnson and Kotz (1970,
p. 197) which is described in Appendix 6, and taking $\alpha = .05$,
we find that a conservative lower bound for the power of the
optimal upper tailed test is approximately equal to .0649.

The minimum power for the optimal lower tailed test is
approximated (assuming $\beta_2'\beta_2/\sigma^2 = 2000$) by calculating

$$P[F''_{k,N-p;\lambda_{1U},\lambda_{2L}} < F_{(1-\alpha);k,N-p}].$$

The quantities $\lambda_{1U}$ and $\lambda_{2L}$ are taken as
$\lambda_{1U} = \mu_{max}\beta_2'\beta_2/2\sigma^2$ and
$\lambda_{2L} = \delta_{min}\beta_2'\beta_2/2\sigma^2$, where $\mu_{max}$ is the
largest eigenvalue of $\Lambda_1$ with $\Lambda_1$ calculated using the
optimal lower tailed test check point, and where $\delta_{min}$ is the
smallest eigenvalue of $\Lambda_2$. Since $\Lambda_1$ and $\Lambda_2$ are
scalars, $\mu_{max} = \Lambda_1$ and $\delta_{min} = \Lambda_2$.
In this example, k = 1, N - p = 4,
$\lambda_{1U} = (9.6590 \times 10^{-7})(2000/2) = 9.6590 \times 10^{-4}$, and
$\lambda_{2L} = (9.4062 \times 10^{-4})(2000/2) = 9.4062 \times 10^{-1}$. Again if
the approximation to the doubly noncentral F distribution given
in Johnson and Kotz is used, an approximate conservative
lower bound for the power of the optimal lower tailed test
is .0555.
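
The noncentrality parameters used in the two power bounds above are simple products, and the arithmetic can be restated as a short Python sketch. The function name `noncentrality` is our own; the sketch does not implement the Johnson and Kotz approximation of Appendix 6, so it reproduces only the parameters and not the power values .0649 and .0555.

```python
def noncentrality(eigenvalue, ratio):
    """lambda = eigenvalue * (beta2'beta2 / sigma^2) / 2 (hypothetical helper)."""
    return eigenvalue * ratio / 2.0

ratio = 2000.0  # the arbitrarily chosen value of beta2'beta2 / sigma^2

# Upper tailed test: smallest eigenvalue of Lambda1, largest of Lambda2.
lam_1L = noncentrality(4.9076e-4, ratio)
lam_2U = noncentrality(9.4062e-4, ratio)

# Lower tailed test: largest eigenvalue of Lambda1, smallest of Lambda2.
lam_1U = noncentrality(9.6590e-7, ratio)
lam_2L = noncentrality(9.4062e-4, ratio)

print(lam_1L, lam_2U)  # numerator and denominator parameters, upper tailed test
print(lam_1U, lam_2L)  # numerator and denominator parameters, lower tailed test
```

Because $\Lambda_1$ and $\Lambda_2$ are scalars in this example, each eigenvalue is just the scalar itself.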

Having specified $\beta_2'\beta_2/\sigma^2 = 2000$, the optimal upper
tailed test is chosen over the optimal lower tailed test,
because the approximate minimum power of the upper tailed
test is greater than the approximate minimum power of the
lower tailed test. Using the optimal upper tailed test
check point (1/3, 1/3, 1/3) in the test statistic

$$F = \frac{d'V_0^{-1}d/1}{MSE},$$

we conclude that lack of fit is significant if the
calculated value of F exceeds $F_{\alpha;1,4}$, in which case we
reject $H_0: \lambda_1 = \lambda_2 = 0$ in favor of
$H_a: \lambda_1/1 - \lambda_2/4 > 0$.

When two simultaneous check points are used for testing
lack of fit, the Controlled Random Search Procedure locates
the optimal upper tailed test and optimal lower tailed test
check points. It turns out that two replicates at (1/3,
1/3, 1/3) maximize $\Lambda_1$, and are used as optimal check points
for an upper tailed test. The value of
$R = [\Lambda_1/2] - [\Lambda_2/4]$ is
$[(7.9210 \times 10^{-4})/2] - [(9.4062 \times 10^{-4})/4] = 1.6090 \times 10^{-4}$.

In searching for two optimal lower tailed test check
points, again a subregion of the factor space is found in
which any of the points nearly minimize $\Lambda_1$. From this
subregion are chosen the points (0.6386, 0.3263, 0.0351) and
(0.7257, 0.2421, 0.0322) resulting in a value of
$R = [\Lambda_1/2] - [\Lambda_2/4]$ of
$[(1.5216 \times 10^{-9})/2] - [(9.4062 \times 10^{-4})/4] = -2.3516 \times 10^{-4}$.

In conclusion, when two simultaneous check points are
used in the test for lack of fit in this example, R is
positive for the optimal upper tailed test and R is negative
for the optimal lower tailed test, and we have Case 3 of
Section 3.3.4. Selecting $\beta_2'\beta_2/\sigma^2 = 2000$ arbitrarily, we
found the approximate lower bound for the power of the upper
tailed test to be .0504, and the approximate lower bound for
the power of the lower tailed test to be .0612. Since the
power is higher with the lower tailed test it is our choice
for testing lack of fit when two check points are used
simultaneously. Lack of fit is detected and we reject
$H_0: \lambda_1 = \lambda_2 = 0$ in favor of
$H_a: [\lambda_1/2] - [\lambda_2/4] < 0$ if the F
ratio, $F = (d'V_0^{-1}d/2)/MSE$, using the optimal lower tailed
test check points (0.6386, 0.3263, 0.0351) and (0.7257,
0.2421, 0.0322) is calculated to be less than $F_{(1-\alpha);2,4}$.

3.4.2   Numerical Examples

Numerical Example 1. In this example we illustrate
numerically some of the findings in the first theoretical
example of Section 3.4.1. Data that were collected in a
rocket fuel experiment (Kurotori, 1966) will be used to
investigate the power of the lack of fit F test. The test
is set up to detect the inadequacy of a fitted second order
canonical polynomial model when the true model is special
cubic. Calculated values of the power of the test which
detects lack of fit through large values of

$$F = \frac{d'V_0^{-1}d/k}{\hat\sigma^2_{ext}}$$

will be compared for several check point locations, including
the location (1/3, 1/3, 1/3) at which the power was
found to be maximum in Example 1 of Section 3.4.1.

In Kurotori's experiment the modulus of elasticity (Y)
of a rocket fuel is expressed as a function of the
proportions of three components: binder ($x_1$), oxidizer ($x_2$),
and fuel ($x_3$). Since lower bounds are placed on the
component proportions $x_1$, $x_2$, and $x_3$, in the form of
$0.20 \le x_1$, $0.40 \le x_2$, and $0.20 \le x_3$, pseudocomponents
($x_i'$) are defined in terms of the original components in the form
of $x_1' = (x_1 - 0.20)/(1 - .80)$, $x_2' = (x_2 - 0.40)/(1 - .80)$,
and $x_3' = (x_3 - 0.20)/(1 - .80)$. The true special cubic
model in the pseudocomponents, which is obtained by using
the data at the seven points of the simplex centroid design
in the pseudocomponent system, is

$$E(Y) = 2350x_1' + 2450x_2' + 2650x_3' + 0x_1'x_2' + 1000x_1'x_3' + 1600x_2'x_3' + 6150x_1'x_2'x_3'.$$

The second order canonical polynomial model that is fitted
to the six boundary points only, and which will be tested
for lack of fit, is given by

$$\hat{Y} = 2350x_1' + 2450x_2' + 2650x_3' + 1000x_1'x_3' + 1600x_2'x_3'.$$

The configuration of the experimental design as well as the
check point locations are depicted in Figure 4 of Chapter 2
and the observed response values are given in Table 3 of
this chapter.

Table 3. Observed Response Values at the
Pseudocomponent Settings for Kurotori's Rocket Fuel
Experiment (Numerical Example 1).

Observation   Binder   Oxidizer   Fuel    Modulus of Elasticity
Number        x1'      x2'        x3'     Y

 1            1        0          0       2350
 2            0        1          0       2450
 3            0        0          1       2650
 4            1/2      1/2        0       2400
 5            1/2      0          1/2     2750
 6            0        1/2        1/2     2950
 7*           1/3      1/3        1/3     3000
 8*           2/3      1/6        1/6     2690
 9*           1/6      2/3        1/6     2770
10*           1/6      1/6        2/3     2980

*Check Points.

A value of the ratio $F = [d'V_0^{-1}d]/\hat\sigma^2_{ext}$ is calculated at
each of the four individual check points (1/3, 1/3, 1/3),
(2/3, 1/6, 1/6), (1/6, 2/3, 1/6), and (1/6, 1/6, 2/3)
where $\hat\sigma^2_{ext}$ is assumed to have the value
$\hat\sigma^2_{ext} = 144$ as suggested by Kurotori (1966). We also assume
without loss of generality that the degrees of freedom associated
with $\hat\sigma^2_{ext}$ are $\nu = 10$. The power of the F test is
calculated at each of the four check points by using the approximation
to the cumulative probabilities of the noncentral F
distribution given by Johnson and Kotz (1970, p. 197) to
evaluate

$$Power = P\{F'_{1,10;\lambda_1} > F_{.05;1,10}\},$$

where $\lambda_1 = \Lambda_1\beta_{123}^2/2\hat\sigma^2_{ext}$. The value of
$\Lambda_1 = (X_2^* - X^*A)'V_0^{-1}(X_2^* - X^*A)$ is calculated for each
check point using the values of $X_2^*$, $X^*$, $V_0$ and the value of
$A = (X'X)^{-1}X'X_2$ which is fixed by the {3,2} simplex lattice
design. Since the {3,2} simplex lattice consists of points
only on the boundaries of the triangle (and therefore at
each point at least one of the $x_i'$ values is equal to zero),
then $X_2 = 0$ and $A = 0$. From the true special cubic model,
$\beta_{123} = 6150$.
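
The ingredients of this calculation can be reproduced from the data of Table 3. The sketch below is a hedged illustration only: it assumes NumPy, the helper name `quad_terms` and the layout of `results` are our own, and it stops at the F ratio and the noncentrality parameter $\lambda_1$ because the power approximation of Johnson and Kotz is not restated here. The comparison value 4.96 is the tabulated upper 5 percent point of the central F distribution with 1 and 10 degrees of freedom.

```python
import numpy as np

def quad_terms(point):
    """Second order terms (x1, x2, x3, x1x2, x1x3, x2x3) at one point."""
    x1, x2, x3 = point
    return np.array([x1, x2, x3, x1 * x2, x1 * x3, x2 * x3], dtype=float)

# Observations 1-6 of Table 3: the {3,2} simplex lattice in pseudocomponents.
design = [(1, 0, 0), (0, 1, 0), (0, 0, 1),
          (0.5, 0.5, 0), (0.5, 0, 0.5), (0, 0.5, 0.5)]
y = np.array([2350, 2450, 2650, 2400, 2750, 2950], dtype=float)

# Observations 7-10 of Table 3: the four check points.
checks = [((1 / 3, 1 / 3, 1 / 3), 3000.0), ((2 / 3, 1 / 6, 1 / 6), 2690.0),
          ((1 / 6, 2 / 3, 1 / 6), 2770.0), ((1 / 6, 1 / 6, 2 / 3), 2980.0)]

sigma2_ext, beta123, f_crit = 144.0, 6150.0, 4.96

X = np.array([quad_terms(pt) for pt in design])
XtX_inv = np.linalg.inv(X.T @ X)
b_hat = XtX_inv @ X.T @ y             # fitted second order coefficients

results = []
for pt, y_obs in checks:
    xs = quad_terms(pt)
    d = y_obs - xs @ b_hat            # observed minus predicted response
    V0 = 1.0 + xs @ XtX_inv @ xs
    F = d * d / V0 / sigma2_ext       # k = 1 check point
    Lam1 = (pt[0] * pt[1] * pt[2]) ** 2 / V0   # A = 0 for this design
    lam1 = Lam1 * beta123 ** 2 / (2.0 * sigma2_ext)
    results.append((pt, d, F, Lam1, lam1))
    print(np.round(pt, 3), round(d, 2), round(F, 2), F > f_crit)
```

Since $X_2 = 0$ on the boundary design, the alias vector drops out and $\Lambda_1$ reduces to $(x_1'x_2'x_3')^2/V_0$ at each check point, which is the simplification used in the loop.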

The calculated value of F as well as the approximate
value for the power at each of the four check points is
given in Table 4. The check point (1/3, 1/3, 1/3) produced
the highest power of the four check points investigated,
supporting the previous results of Example 1 in Section
3.4.1 where (1/3, 1/3, 1/3) was selected as the check point
location with the maximum power when a second order
canonical polynomial was fitted using the {3,2} simplex
lattice design, but the true model was assumed to be special
cubic. Additional support for the point (1/3, 1/3, 1/3)
being optimal is given by the contour plot of values of $\Lambda_1$
in Figure 6(d). The highest values of $\Lambda_1$ appear near the
centroid (1/3, 1/3, 1/3) where high $\Lambda_1$ values translate into
high $\lambda_1$ values, since
$\lambda_1 = \Lambda_1\beta_{123}^2/2\hat\sigma^2_{ext}$, which in turn implies
high power since we know the power is an increasing function
of $\lambda_1$.

As a second part of this example the power of the F
test that is obtained when three replicates are taken at
(1/3, 1/3, 1/3) is compared to the power of the F test that
is obtained when one replicate is taken at the three check
points (2/3, 1/6, 1/6), (1/6, 2/3, 1/6), and (1/6, 1/6,
2/3). These latter three point locations were suggested by
Kurotori for testing lack of fit of his fitted special
cubic model. The result of this comparison, see Table 4, is
that the three replicates at (1/3, 1/3, 1/3) produce the
test with greater power which again supports the findings of
Example 1 of Section 3.4.1.

All of the check point locations listed in Table 4
produce very high power values (> .999) which is due in part
to the high value of $\beta_{123}$ ($\beta_{123} = 6150$). If $\beta_{123}$
were of lower magnitude, then the three replicates at (1/3, 1/3,
1/3) would show a still greater superiority in the power
value compared to the power using the other check points.
This superiority is demonstrated in Table 5 where values of
$\beta_{123}$ are listed as 3000 and 1500 and the comparative power
values are listed as 0.998 compared to 0.795 and 0.662
compared to 0.249, respectively. Table 5 also demonstrates
the superior power value for the point (1/3, 1/3, 1/3) when
$\beta_{123} = 3000$ or $\beta_{123} = 1500$ and each of the four check
points is used one at a time.

Finally,  (1/3,  1/3,  1/3)  being  the  optimal  check  point 
location  is  seen  in  Figure  6(c),  where  contour  plots  of  the 
expected  difference  in  the  heights  of  the  surfaces  are 
drawn.   The  differences  in  the  heights  are  found  by 
subtracting  the  estimated  height  of  the  surface  obtained 
with  the  fitted  second  order  model  from  the  estimated  height 
of  the  surface  obtained  with  the  true  special  cubic  model. 
The  expected  difference  between  the  true  and  fitted  surfaces 
approaches  a  maximum  the  closer  one  moves  to  the  centroid  of 
the  simplex  factor  space,  so  that  the  optimal  check  point 
location (1/3, 1/3, 1/3) coincides with the point where the
expected difference between the true special cubic surface
and the fitted second order surface is maximum.

[Figure 6. Contour plots for Numerical Example 1: (a) true special cubic surface; (b) fitted second order surface; (c) expected difference between the true special cubic surface and the fitted second order surface; (d) $\Lambda_1 = (X_2^* - X^*A)'V_0^{-1}(X_2^* - X^*A)$.]

Numerical  Example  2.   In  this  second  numerical  example, 
we  investigate  the  power  of  the  F  test  for  detecting  lack  of 
fit  when  a  second  order  canonical  polynomial  model  is  fitted 
in  a  mixture  system  which  is  in  truth  represented  by  a 
special  cubic  model.   The  true  model  is  assumed  to  be 


$$E(Y) = 2350x_1 + 2450x_2 + 2650x_3 + 1000x_1x_3 + 1600x_2x_3 + 6150x_1x_2x_3$$


which  is  used  to  generate  hypothetical  response  observations 
at  the  seven  points  of  the  q  =  3  simplex  centroid  design  as 
well  as  at  three  check  points.   The  values  of  the  response 
are  obtained  by  adding  the  value  of  a  pseudorandom  normal 
variate  with  mean  0  and  variance  144  to  each  true  predicted 
response  value.   The  data  are  given  in  Table  6. 

The  response  values  at  the  seven  points  of  the  simplex 
centroid  design  are  used  in  the  least  squares  normal 
equations  to  obtain  the  fitted  second  order  model 


$$\hat{Y} = 2341x_1 + 2438x_2 + 2630x_3 + 310x_1x_2 + 1304x_1x_3 + 1970x_2x_3$$

which is to be tested for lack of fit using the test
statistic $F = d'V_0^{-1}d/\mathrm{MSE}$. The F statistic will be evaluated
at each of the three check points (1/3, 1/3, 1/3),
(2/3, 1/6, 1/6), and (.02, .93, .05), taken one at a time,
and the power of the test at the three check point locations
will be calculated and compared. The test is lower tailed
for all check point locations (since $R = \Lambda_1 - \Lambda_2$ is negative
for all check point locations) and thus the power is defined
as

$$P\left[F''_{1,1;\lambda_1,\lambda_2} < F_{.95;1,1}\right].$$

Table 6. Generated Response Values, Numerical Example 2.

      x1     x2     x3       Y
      1      0      0      2357
      0      1      0      2454
      0      0      1      2646
      1/2    1/2    0      2403
      1/2    0      1/2    2747
      0      1/2    1/2    2962
      1/3    1/3    1/3    3013
    * 1/3    1/3    1/3    2993
    * 2/3    1/6    1/6    2693
    * .02    .93    .05    2550

    * Check points

[Figure 7. Contours of $R = \Lambda_1 - \Lambda_2$ for Numerical Example 2, drawn on the simplex with vertices $x_1 = 1$, $x_2 = 1$, $x_3 = 1$; the check point (.02, .93, .05) is marked.]

The values of $\lambda_1 = \beta_{123}^2\Lambda_1/2\sigma^2$ and $\lambda_2 = \beta_{123}^2\Lambda_2/2\sigma^2$ are found
by taking $\beta_{123} = 6150$ and $\sigma^2 = 144$. The results of this
power investigation are given in Table 7.

Since  the  check  point  (.02,  .93,  .05)  produces  the 

greatest  power  of  the  three  check  points  investigated,  this 

supports  the  result  in  Example  2  of  Section  3.4.1,  where  we 

saw  that  the  point  (.02,  .93,  .05)  yielded  the  maximum  power 

of  all  points  for  detecting  lack  of  fit  of  a  fitted  second 

order  canonical  polynomial  model,  using  the  q  =  3  simplex 

centroid  design  in  the  presence  of  a  true  special  cubic 

surface.   Additional  evidence  for  the  point  (.02,  .93,  .05) 

being  an  optimal  check  point  is  shown  in  Figure  7,  where 

contours of the values of $R = \Lambda_1 - \Lambda_2$ are presented. The

point (.02, .93, .05) is seen to belong to an area of the
simplex factor space where R is minimum, which implies that
$\Lambda_1$ (and in turn $\lambda_1$) is also minimum in this area, since
$R = \Lambda_1 - \Lambda_2$ and $\Lambda_2$ has the fixed value of $\Lambda_2 = .00084$ for the
simplex centroid design. Thus the check point (.02, .93,
.05) produces a minimal $\lambda_1$ value and maximum power, since
the power increases with decreasing values of $\lambda_1$.
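
The least squares fit and the fixed value of $\Lambda_2$ quoted in this example can be checked with a short exact computation. The sketch below fits the second order model to the seven simplex centroid observations of Table 6 through the normal equations. It also assumes, as an illustration only, that $\Lambda_2$ is the residual sum of squares left when the $x_1x_2x_3$ column is regressed on the six second order terms. The helper names `solve` and `least_squares` are hypothetical.

```python
from fractions import Fraction as Fr

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square, nonsingular)."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]

def least_squares(X, y):
    """Ordinary least squares coefficients from the normal equations X'X b = X'y."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)

h, t = Fr(1, 2), Fr(1, 3)
design = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (h, h, 0), (h, 0, h), (0, h, h), (t, t, t)]
y = [2357, 2454, 2646, 2403, 2747, 2962, 3013]

# Columns: x1, x2, x3, x1x2, x1x3, x2x3 (second order canonical polynomial).
X = [[Fr(a), Fr(b), Fr(c), Fr(a * b), Fr(a * c), Fr(b * c)] for a, b, c in design]
coef = least_squares(X, y)

# Assumed form of Lambda_2: residual sum of squares of the x1x2x3 column on X.
z = [Fr(a * b * c) for a, b, c in design]
gamma = least_squares(X, z)
resid = [zi - sum(g * v for g, v in zip(gamma, row)) for zi, row in zip(z, X)]
lambda2 = sum(e * e for e in resid)

print([round(float(c)) for c in coef])
print(float(lambda2))
```

In this sketch the linear, $x_1x_3$ and $x_2x_3$ coefficients round to the values of the fitted model in this example, while the $x_1x_2$ coefficient comes out near 312 rather than the printed 310, which may reflect rounding or a transcription difference. The computed $\Lambda_2$ is 1/1188, which agrees with .00084 to the digits quoted.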

3.5   Discussion 
When  check  points  are  used  for  testing  lack  of  fit  in  a 
mixture  model,  the  appropriate  testing  procedure,  assuming  a 

normally distributed response, involves an F statistic. If
an external estimate, $\hat{\sigma}^2_{ext}$, of the experimental error
variance is available so that the test statistic is given by

$$F = \frac{d'V_0^{-1}d/k}{\hat{\sigma}^2_{ext}},$$

then the power of the test for lack of fit is maximized by
choosing k check points that maximize the value of the
noncentrality parameter $\lambda_1$. When $p_2 = 1$, maximizing $\lambda_1$ is
achieved without knowing the value of the elements of $\beta_2$ by
selecting check points that maximize the scalar $\Lambda_1$. When
$p_2 > 1$, the maximization of $\lambda_1$ is approximated by maximizing
a lower bound for $\lambda_1$. This is achieved also without knowing
the values of the elements of $\beta_2$ by selecting check points
that maximize the smallest eigenvalue of the matrix $\Lambda_1$. The
test is upper tailed, and for given values of $\lambda_1$, the actual
power of the test can be calculated from the cumulative
probabilities of the noncentral F distribution. A problem
arises when $\Lambda_1$ is positive semi-definite and its smallest
eigenvalue is equal to zero. In this case check points that
maximize the smallest positive eigenvalue of $\Lambda_1$ are
selected, and lack of fit is only detectable for a subset of
the possible values of the elements of $\beta_2$.

When an external estimate of $\sigma^2$ is not available,
testing lack of fit at the check points is further
complicated. The F statistic is

$$F = \frac{d'V_0^{-1}d/k}{\mathrm{MSE}}$$

and the rejection region for the lack of fit test can be
upper tailed or lower tailed. The power of the test is
determined by using the doubly noncentral F distribution,
which depends on the parameters k, N - p, $\lambda_1$, and $\lambda_2$. Of
these four parameters, only k and $\lambda_1$ are influenced by check
points, and if the value of k is fixed, the power of the
test is maximized by choosing check points that affect the
value of $\lambda_1$. Regardless of the values of the elements of
$\beta_2$, check points that maximize $\lambda_1$ are selected for
maximizing the power of an upper tailed test, and check
points that minimize $\lambda_1$ are selected for maximizing the
power of a lower tailed test.
Lack of fit can be tested with the upper tailed test
for all nonzero values of the elements of $\beta_2$ if the check
points are selected so that $[\Lambda_1/k] - [\Lambda_2/(N - p)]$ is
positive definite, since this forces the expected value of
the numerator mean square in the F ratio to be greater than
the expected value of the denominator mean square.
Similarly, lack of fit can be tested with a lower tailed
test if check points are selected which make $[\Lambda_1/k] -
[\Lambda_2/(N - p)]$ negative definite. When it is not possible to
select check points that make $[\Lambda_1/k] - [\Lambda_2/(N - p)]$ either
positive definite or negative definite, then detection of
lack of fit is only possible for a subset of all nonzero
values of the elements of $\beta_2$.
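
For the scalar case $p_2 = 1$, the definiteness condition above reduces to the sign of a single number. The following is a minimal sketch of that rule only. The function name `rejection_tail` is hypothetical, the settings $\Lambda_2 = .00084$, $k = 1$ and $N - p = 1$ echo Numerical Example 2, and the two $\Lambda_1$ values are invented purely for illustration.

```python
def rejection_tail(lam1, lam2, k, n_minus_p):
    """Tail suggested by the sign of Lambda1/k - Lambda2/(N - p) when p2 = 1."""
    r = lam1 / k - lam2 / n_minus_p
    if r > 0:
        return "upper"
    if r < 0:
        return "lower"
    return "undetermined"

# Lambda2, k and N - p follow Numerical Example 2; the Lambda1 values are hypothetical.
for lam1 in (0.0002, 0.0050):
    print(lam1, rejection_tail(lam1, 0.00084, k=1, n_minus_p=1))
```

When $p_2 > 1$ the same comparison has to be made on the eigenvalues of the matrix difference rather than on a single sign.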

The power of the test for lack of fit using the F
statistic in (3.8) is a function of $\lambda_1$ and $\lambda_2$. Since the
magnitudes of $\lambda_1$ and $\lambda_2$ are influenced by the experimental
design, an area for future study is the investigation of the
effect of the experimental design on the power of the lack
of fit test. In the presence of an external estimate of
$\sigma^2$, Atkinson (1972) suggested designs that maximize the
determinant of $\Lambda_2$, $|\Lambda_2|$, when lack of fit is to be detected
by a large value of $\lambda_2$ using a procedure which is, in
general, equivalent to the lack of fit testing procedure
that partitions the residual sum of squares into pure error
and lack of fit sums of squares. It might be useful to
apply Atkinson's (1972) methodology not only to $|\Lambda_2|$, but
to $|\Lambda_1|$ or $|\Lambda_1/k - \Lambda_2/(N - p)|$ in order to find an
appropriate design when testing lack of fit with the F ratio
in Eq. (3.8). Since the power of the test in Eq. (3.8) is
also affected by the values of k and N - p, which are the
numerator and denominator degrees of freedom of the doubly
noncentral F distribution, respectively, optimal settings
for these parameters can also be considered. For a given
fitted model, p is fixed so that the degrees of freedom
would be influenced by the number of check points selected,
k, and by N, the total number of observations. Finally, the
experimental design and the number of check points also
affect the power of the F test when $\hat{\sigma}^2_{ext}$ is used to
estimate $\sigma^2$. Thus the effect of the experimental design and
the number of check points can also be investigated for the
situation where the lack of fit test statistic is given by
Eq. (3.3).

We now conclude our investigation of the check point
approach to lack of fit testing and in the next chapter turn
to an investigation of a near neighbor method for testing
lack of fit.

CHAPTER  FOUR 

USE  OF  NEAR  NEIGHBOR  OBSERVATIONS 

FOR  TESTING  LACK  OF  FIT 

4.1   Introduction 


In  an  experiment  in  which  replicate  response 
observations  are  available  at  one  or  more  design  points, 
lack  of  fit  of  a  fitted  model  can  be  tested  by  a  procedure 
which  involves  partitioning  the  residual  sum  of  squares  into 
two  statistically  independent  portions.   One  portion  is  the 
sum of squares due to lack of fit ($SS_{LOF}$), and the second
portion is the sum of squares due to pure error ($SSE_{pure}$)
obtained from the replicates. As discussed in Section 2.2,
this procedure was suggested by Draper and Smith (1981,
p. 120). Lack of fit is inferred if the calculated value of
the ratio

$$F = \frac{MS_{LOF}}{MSE_{pure}} \qquad (4.1)$$

exceeds the corresponding upper $100\alpha$ percentage point of the
central F distribution, where $MS_{LOF}$ and $MSE_{pure}$ are the mean
square values found by dividing $SS_{LOF}$ and $SSE_{pure}$ by their
respective degrees of freedom.

In  order  to  test  the  fitted  model  for  lack  of  fit  when 
replicate  observations  are  not  available,  Shillington  (1979) 
suggested  a  procedure  which  uses  observed  response  values 
collected at points which are "near neighbors" in the factor
space in place of replicates (see Section 2.3). Lack of fit
is inferred when the calculated value of the ratio

$$F = \frac{MSE_B}{MSE_W} \qquad (4.2)$$

exceeds the upper $100\alpha$ percentage point of the central F
distribution. The numerator, $MSE_B$, of the F ratio in Eq.
(4.2) is a generalization of the numerator, $MS_{LOF}$, of the F
ratio in Eq. (4.1). The form of $MSE_B$ will be given in Eq.
(4.5) of Section 4.3. The denominator, $MSE_W$, in Eq. (4.2)
is a generalization of $MSE_{pure}$ in Eq. (4.1), and the value
of $MSE_W$ is calculated using near neighbor observations in
place of replicates (see Eq. (4.6) in Section 4.3).

Shillington's near neighbor method provides an
alternative to the check points method when replicate
observations are not available. Typically, near neighbors
arise either because an experiment was not designed
to provide replicate observations, or because a designed
experiment places a large number of design points in
a bounded factor space, so that some points lie
near one another.

In this chapter we shall further study Shillington's
(1979) near neighbor procedure for testing lack of fit. A
question involving the correctness of the ordinary least
squares technique suggested by Shillington for deriving the
denominator, $MSE_W$, of the F ratio in Eq. (4.2) will
be raised. The question will be resolved by showing the
equivalence of deriving $MSE_W$ by ordinary least squares and
of deriving $MSE_W$ by a generalization of weighted least
squares. We will verify that when the observable response
values are assumed to have the normal distribution with
homogeneous variance, $\sigma^2$, the F ratio in Eq. (4.2) possesses
a central F distribution when the fitted model is adequate,
but possesses a doubly noncentral F distribution when the
fitted model suffers from lack of fit. We shall also show
that the F test for lack of fit which uses the statistic in
Eq. (4.2) can have either an upper tailed or a lower tailed
rejection region. Finally, the use of a clustering
algorithm for defining groups of near neighbors will be
proposed.

4.2   Notation 

In  this  section  we  introduce  the  notation  to  be  used  in 

this chapter. Throughout our investigation of Shillington's
near neighbor procedure for testing lack of fit, we shall
assume the observed response values collected in an
experiment can be grouped into g cells where the jth cell
contains $n_j$ observations, j = 1, 2, ..., g. The
observations in a cell are from points that are "near
neighbors" in the sense that they are near one another in
the factor (mixture) space. A model of the form

$$E(Y) = X\beta_1 \qquad (4.3)$$

is fitted to the data using ordinary least squares, but the
true model is assumed to have the form

$$E(Y) = X\beta_1 + X_2\beta_2, \qquad (4.4)$$

where Y is an $N \times 1$ vector of response values observable at
the design points with $\mathrm{var}(Y) = \sigma^2I_N$, X and $X_2$ are $N \times p$ and
$N \times p_2$ matrices of known constants, respectively, and $\beta_1$ and
$\beta_2$ are $p \times 1$ and $p_2 \times 1$ vectors of unknown regression
coefficients, respectively. The vector Y is assumed to have
the normal distribution.

Let us now define the following vector and matrix
quantities to be used in developing the numerator, $MSE_B$, of
the F ratio in Eq. (4.2):

$Y_C$ = a $g \times 1$ vector with jth element equal to the
average of the $n_j$ observed response values in the
jth cell of near neighbor observations, j = 1, 2,
..., g.

$X_C$ = a $g \times p$ matrix whose jth row is the average of the
$n_j$ rows of X corresponding to the jth cell, j =
1, 2, ..., g.

$X_{2C}$ = a $g \times p_2$ matrix whose jth row is the average of the
$n_j$ rows of $X_2$ corresponding to the jth cell, j =
1, 2, ..., g.

$G_C$ = a $g \times g$ diagonal matrix of the form

$$G_C = \mathrm{diag}[1/n_1, 1/n_2, \ldots, 1/n_g].$$


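To further illustrate the forms of $Y_C$, $X_C$, $X_{2C}$, and $G_C$,
we present the following numerical example. Consider a data
set consisting of response observations (Y) taken at N = 8
different combinations of the settings of the factors $x_1$ and
$x_2$, where the eight response observations are divided into
g = 4 near neighbor cells. The vector of observed response
values, Y, and the matrix X corresponding to the first order
model, $E(Y) = \beta_0 + \beta_1x_1 + \beta_2x_2$, are

$$Y = \left[\begin{array}{c} 10 \\ \hline 13 \\ 16 \\ \hline 15 \\ 18 \\ 21 \\ \hline 27 \\ 30 \end{array}\right], \qquad
X = \left[\begin{array}{ccc} 1 & 1 & 2 \\ \hline 1 & 2 & 5 \\ 1 & 2 & 4 \\ \hline 1 & 3 & 2 \\ 1 & 3 & 1 \\ 1 & 4 & 2 \\ \hline 1 & 5 & 5 \\ 1 & 5 & 4 \end{array}\right].$$

The horizontal lines in Y and X delineate the four cells of
near neighbors. In this example

$$Y_C = \left[\begin{array}{c} 10 \\ (13 + 16)/2 \\ (15 + 18 + 21)/3 \\ (27 + 30)/2 \end{array}\right]
= \left[\begin{array}{c} 10 \\ 14.5 \\ 18 \\ 28.5 \end{array}\right],$$

$$X_C = \left[\begin{array}{ccc} 1 & 1 & 2 \\ (1 + 1)/2 & (2 + 2)/2 & (5 + 4)/2 \\ (1 + 1 + 1)/3 & (3 + 3 + 4)/3 & (2 + 1 + 2)/3 \\ (1 + 1)/2 & (5 + 5)/2 & (5 + 4)/2 \end{array}\right]
= \left[\begin{array}{ccc} 1 & 1 & 2 \\ 1 & 2 & 4.5 \\ 1 & 3.3 & 1.7 \\ 1 & 5 & 4.5 \end{array}\right],$$

and

$$G_C = \mathrm{diag}(1, 1/2, 1/3, 1/2).$$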

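If the true model is second degree, $E(Y) = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_{12}x_1x_2 + \beta_{11}x_1^2 + \beta_{22}x_2^2$, then the $X_2$ and
$X_{2C}$ matrices have three columns corresponding to the
terms $x_1x_2$, $x_1^2$, and $x_2^2$, respectively. For this numerical
example we have

$$X_2 = \left[\begin{array}{ccc} 2 & 1 & 4 \\ \hline 10 & 4 & 25 \\ 8 & 4 & 16 \\ \hline 6 & 9 & 4 \\ 3 & 9 & 1 \\ 8 & 16 & 4 \\ \hline 25 & 25 & 25 \\ 20 & 25 & 16 \end{array}\right]$$

and

$$X_{2C} = \left[\begin{array}{ccc} 2 & 1 & 4 \\ 9 & 4 & 20.5 \\ 5.7 & 11.3 & 3 \\ 22.5 & 25 & 20.5 \end{array}\right].$$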
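
The cell averages in this numerical example can be reproduced with a few lines of exact arithmetic. The sketch below is only an illustration of the definitions of $Y_C$, $X_C$, $X_{2C}$ and the diagonal of $G_C$; the cell membership is taken from the example, while the helper name `cell_means` and the list `cells` are hypothetical.

```python
from fractions import Fraction as Fr

Y = [10, 13, 16, 15, 18, 21, 27, 30]
x1 = [1, 2, 2, 3, 3, 4, 5, 5]
x2 = [2, 5, 4, 2, 1, 2, 5, 4]
cells = [[0], [1, 2], [3, 4, 5], [6, 7]]  # row indices of the g = 4 cells

X = [[1, a, b] for a, b in zip(x1, x2)]               # first order model
X2 = [[a * b, a * a, b * b] for a, b in zip(x1, x2)]  # x1x2, x1^2, x2^2

def cell_means(rows):
    """Average the rows of a matrix within each near neighbor cell."""
    ncol = len(rows[0])
    return [[Fr(sum(rows[i][c] for i in cell), len(cell)) for c in range(ncol)]
            for cell in cells]

Y_C = [row[0] for row in cell_means([[y] for y in Y])]
X_C = cell_means(X)
X_2C = cell_means(X2)
G_C_diag = [Fr(1, len(cell)) for cell in cells]

print([float(v) for v in Y_C])
print([[round(float(v), 1) for v in row] for row in X_C])
print([[round(float(v), 1) for v in row] for row in X_2C])
```

Exact fractions are used so that entries such as 10/3 and 17/3 are only rounded (to 3.3 and 5.7) when printed.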
Next let us define the following quantities to be used
in developing the denominator, $MSE_W$, of the F ratio in Eq.
(4.2):

$W$ = an $N \times 1$ vector of within cell deviations where the
ith element, $W_i$, of W is equal to the difference
between the ith element, $Y_i$, of Y and the average
of the response values observed in the near
neighbor cell containing $Y_i$, i = 1, 2, ..., N.

$X_W$ = an $N \times p$ matrix whose ith row is equal to the ith
row of the X matrix minus the row of $X_C$
corresponding to the cell containing the response
value observed at the ith row of X.

$r$ = rank($X_W$).

$X_{2W}$ = an $N \times p_2$ matrix whose ith row is equal to the ith
row of the $X_2$ matrix minus the row of $X_{2C}$
corresponding to the cell containing the response
value observed at the ith row of X.

$\Sigma_0$ = an $N \times N$ idempotent matrix of the form

$$\Sigma_0 = I_N - \mathrm{diag}[(1/n_1)J_1, (1/n_2)J_2, \ldots, (1/n_g)J_g]$$

where $J_j$ is an $n_j \times n_j$ matrix of ones, and $I_N$ is
the identity matrix of order $N \times N$, with $N = \sum_{j=1}^{g} n_j$.

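Let us illustrate the forms of $W$, $X_W$, $X_{2W}$, and $\Sigma_0$ by
using the numerical example presented earlier in this
section, where the eight response observations were
distributed among four cells. For these data we have

$$W = \left[\begin{array}{c} 10 - 10 \\ 13 - 29/2 \\ 16 - 29/2 \\ 15 - 18 \\ 18 - 18 \\ 21 - 18 \\ 27 - 57/2 \\ 30 - 57/2 \end{array}\right]
= \left[\begin{array}{c} 0 \\ -1.5 \\ 1.5 \\ -3 \\ 0 \\ 3 \\ -1.5 \\ 1.5 \end{array}\right],$$

$$X_W = \left[\begin{array}{ccc} 1 - 1 & 1 - 1 & 2 - 2 \\ 1 - 1 & 2 - 2 & 5 - 9/2 \\ 1 - 1 & 2 - 2 & 4 - 9/2 \\ 1 - 1 & 3 - 10/3 & 2 - 5/3 \\ 1 - 1 & 3 - 10/3 & 1 - 5/3 \\ 1 - 1 & 4 - 10/3 & 2 - 5/3 \\ 1 - 1 & 5 - 5 & 5 - 9/2 \\ 1 - 1 & 5 - 5 & 4 - 9/2 \end{array}\right]
= \left[\begin{array}{ccc} 0 & 0 & 0 \\ 0 & 0 & .5 \\ 0 & 0 & -.5 \\ 0 & -.3 & .3 \\ 0 & -.3 & -.7 \\ 0 & .7 & .3 \\ 0 & 0 & .5 \\ 0 & 0 & -.5 \end{array}\right],$$

$$X_{2W} = \left[\begin{array}{ccc} 2 - 2 & 1 - 1 & 4 - 4 \\ 10 - 9 & 4 - 4 & 25 - 41/2 \\ 8 - 9 & 4 - 4 & 16 - 41/2 \\ 6 - 17/3 & 9 - 34/3 & 4 - 3 \\ 3 - 17/3 & 9 - 34/3 & 1 - 3 \\ 8 - 17/3 & 16 - 34/3 & 4 - 3 \\ 25 - 45/2 & 25 - 25 & 25 - 41/2 \\ 20 - 45/2 & 25 - 25 & 16 - 41/2 \end{array}\right]
= \left[\begin{array}{ccc} 0 & 0 & 0 \\ 1 & 0 & 4.5 \\ -1 & 0 & -4.5 \\ .3 & -2.3 & 1 \\ -2.7 & -2.3 & -2 \\ 2.3 & 4.7 & 1 \\ 2.5 & 0 & 4.5 \\ -2.5 & 0 & -4.5 \end{array}\right],$$

and

$$\Sigma_0 = \left[\begin{array}{cccccccc}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1/2 & -1/2 & 0 & 0 & 0 & 0 & 0 \\
0 & -1/2 & 1/2 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 2/3 & -1/3 & -1/3 & 0 & 0 \\
0 & 0 & 0 & -1/3 & 2/3 & -1/3 & 0 & 0 \\
0 & 0 & 0 & -1/3 & -1/3 & 2/3 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1/2 & -1/2 \\
0 & 0 & 0 & 0 & 0 & 0 & -1/2 & 1/2
\end{array}\right].$$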
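
The within cell quantities of this example can also be generated directly from $\Sigma_0$, using the relations $W = \Sigma_0Y$ and $X_W = \Sigma_0X$ that are stated later in Section 4.3.2. The following sketch builds $\Sigma_0$ for cells of sizes 1, 2, 3 and 2 and checks that it is idempotent. The helper names `sigma0` and `matmul` are hypothetical.

```python
from fractions import Fraction as Fr

def sigma0(cell_sizes):
    """I_N minus the block diagonal matrix with blocks (1/n_j) J_j."""
    N = sum(cell_sizes)
    S = [[Fr(0)] * N for _ in range(N)]
    start = 0
    for n in cell_sizes:
        for i in range(start, start + n):
            for j in range(start, start + n):
                S[i][j] = Fr(int(i == j)) - Fr(1, n)
        start += n
    return S

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

Y = [[10], [13], [16], [15], [18], [21], [27], [30]]
X = [[1, 1, 2], [1, 2, 5], [1, 2, 4], [1, 3, 2],
     [1, 3, 1], [1, 4, 2], [1, 5, 5], [1, 5, 4]]

S0 = sigma0([1, 2, 3, 2])
W = [row[0] for row in matmul(S0, Y)]   # within cell deviations
X_W = matmul(S0, X)                     # rows of X minus their cell averages

print([float(w) for w in W])
print(matmul(S0, S0) == S0)             # idempotency of Sigma_0
```

The first column of `X_W` is identically zero because the intercept column of X is constant within every cell.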


4.3  Shillington's  Procedure 
We  originally  described  Shillington's  near  neighbor 
procedure  for  testing  lack  of  fit  in  Section  2.3  of  Chapter 
Two.   We  now  reintroduce  the  procedure  by  using  the  matrix 
and vector quantities defined in Section 4.2. The test that
Shillington proposed involves the use of an F ratio (see Eq.
(4.2)) of two statistically independent mean square values,
each of which is an unbiased estimate of $\sigma^2$ when the fitted
model is the correct true model. The two independent mean
squares become biased estimates of $\sigma^2$ when the fitted model
suffers from lack of fit. Shillington's methodology detects
lack of fit when the calculated value of the F ratio in Eq.
(4.2) is large; thus his test is upper tailed. We shall see
later in Section 4.7 that the test is not always upper
tailed, and may be lower tailed. Shillington points out
that the power of the test depends on the relative
magnitudes of $E(MSE_B)$ and $E(MSE_W)$, that is, the power
depends on the difference between the expected values of the
numerator and of the denominator in the F ratio in Eq.
(4.2). We shall be more specific than Shillington by
discussing the power of the test in terms of parameters of
the doubly noncentral F distribution.

We now turn to defining Shillington's test statistic in
matrix notation. Shillington's F ratio takes the form (see
Eq. (4.2))

$$F = \frac{SSE_B/(g - p)}{SSE_W/(N - g - r)} = \frac{MSE_B}{MSE_W},$$
where $SSE_B$ is the residual sum of squares from a weighted
least squares regression analysis in which $Y_C$ is regressed
on $X_C$, g is the number of cells of near neighbors, p is the
number of terms in the fitted model, and r is the rank of
$X_W$. The quantity $SSE_B$ can be written as the quadratic form

$$SSE_B = Y_C'[G_C^{-1} - G_C^{-1}X_C(X_C'G_C^{-1}X_C)^{-1}X_C'G_C^{-1}]Y_C \qquad (4.5)$$

(see Graybill, 1976, p. 329; also see Draper and Smith,
1981, p. 109). The quantity $SSE_W$ is defined as

$$SSE_W = W'[I_N - X_W(X_W'X_W)^-X_W']W, \qquad (4.6)$$

where $(X_W'X_W)^-$ is any generalized inverse of $(X_W'X_W)$. [A
matrix $A^-$ is defined as a generalized inverse of the matrix
A if $AA^-A = A$.] The quadratic form $SSE_W$ is the residual sum
of squares from an ordinary least squares regression
analysis in which W is regressed on $X_W$.

In the next two sections we shall discuss the
development of the numerator and denominator, $MSE_B$ and $MSE_W$,
respectively, of the F ratio given in Eq. (4.2). We then
suggest an alternative representation for $MSE_W$ which relies
on a generalization of weighted least squares. This
alternative representation for $MSE_W$ will be shown to be
equivalent to Shillington's expression for $MSE_W$.


4.3.1   Development of $MSE_B$


The quantity $MSE_B = SSE_B/(g - p)$ is the numerator of
the F ratio in Eq. (4.2). As mentioned in Section 4.3, the
quantity $SSE_B$ is the residual sum of squares from a weighted
least squares regression analysis in which $Y_C$ is regressed
on $X_C$. The weighting is appropriate because
$\mathrm{var}(Y_C) = \sigma^2G_C$ not only when the fitted model is adequate
(under model (4.3)), but also when the fitted model suffers
from lack of fit (under model (4.4)). In order to further
explain the ($Y_C$, $X_C$) system, we define the matrix M as

$$M = \mathrm{diag}[(1/n_1)1_1', \ldots, (1/n_g)1_g']$$

where $1_j$ is an $n_j \times 1$ vector of ones, j = 1, 2, ..., g.
Then the ($Y_C$, $X_C$) system can be derived as a linear
transformation of the (Y, X) system. That is, application
of the transformation matrix M yields the following
equalities

$$Y_C = MY, \qquad X_C = MX, \qquad \text{and} \qquad X_{2C} = MX_2.$$

From this it follows that $\mathrm{var}(Y_C) = M\,\mathrm{var}(Y)M' = \sigma^2MM' = \sigma^2G_C$, since $MM' = G_C$. Under the hypothesized model of
Eq. (4.3), $E(Y_C) = ME(Y) = MX\beta_1 = X_C\beta_1$, whereas under the
model of Eq. (4.4), $E(Y_C) = M(X\beta_1 + X_2\beta_2) = X_C\beta_1 + X_{2C}\beta_2$.
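
The role of the transformation matrix M can be illustrated with the cell sizes 1, 2, 3 and 2 of the numerical example in Section 4.2. The sketch below builds M as a $g \times N$ matrix and confirms that $MY$ gives the cell averages and that $MM'$ is diagonal with entries $1/n_j$. The helper names `averaging_matrix`, `matmul` and `transpose` are hypothetical.

```python
from fractions import Fraction as Fr

def averaging_matrix(cell_sizes):
    """g x N matrix M whose jth row holds 1/n_j in the columns of cell j."""
    N = sum(cell_sizes)
    M, start = [], 0
    for n in cell_sizes:
        M.append([Fr(1, n) if start <= col < start + n else Fr(0)
                  for col in range(N)])
        start += n
    return M

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

M = averaging_matrix([1, 2, 3, 2])
Y = [[10], [13], [16], [15], [18], [21], [27], [30]]

Y_C = matmul(M, Y)             # cell averages of the responses
MMt = matmul(M, transpose(M))  # should equal diag(1, 1/2, 1/3, 1/2)

print([float(row[0]) for row in Y_C])
print([[str(v) for v in row] for row in MMt])
```

Because $\mathrm{var}(Y) = \sigma^2I_N$, the product $MM'$ is all that is needed to obtain the variance of the cell averages.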

We now consider the distribution of the random quantity
$SSE_B/\sigma^2$. It can be shown (Theorem 2, Searle, 1971, p. 57)
that under the model of Eq. (4.3), $SSE_B/\sigma^2$ possesses a
central chi-square distribution with g - p degrees of
freedom, but that under the model of Eq. (4.4), $SSE_B/\sigma^2$
possesses a noncentral chi-square distribution with g - p
degrees of freedom and noncentrality parameter $\eta_1$, where

$$\eta_1 = (1/2\sigma^2)\beta_2'X_{2C}'[G_C^{-1} - G_C^{-1}X_C(X_C'G_C^{-1}X_C)^{-1}X_C'G_C^{-1}]X_{2C}\beta_2. \qquad (4.7)$$

Here we point out that the noncentrality parameter for
$SSE_B/\sigma^2$ given by Shillington (1979) is not correct and
should be written as in Eq. (4.7).

Finally we note that $SSE_B$ is equivalent to the usual
lack of fit sum of squares, $SS_{LOF}$, where $SS_{LOF}/(g - p) = MS_{LOF}$
is the numerator of the F ratio in Eq. (4.1), when
all observations in each cell are true replicates rather
than near neighbors. Shillington pointed out this fact, but
did not give a proof. We offer a proof in Appendix 7.

4.3.2   Development of $MSE_W$

The quantity $MSE_W = SSE_W/(N - g - r)$, where r denotes
the rank of $X_W$, is the denominator of the F ratio in Eq.
(4.2). As mentioned in Section 4.3, the quantity $SSE_W$ is
the residual sum of squares from an ordinary least squares
regression analysis in which W is regressed on $X_W$. Using
Theorem 2 (Searle, 1971, p. 57) and noting that
$W = \Sigma_0Y$ and $\Sigma_0\Sigma_0 = \Sigma_0$, it can be shown that under the
hypothesized model of Eq. (4.3), $SSE_W/\sigma^2$ possesses a central
chi-square distribution with N - g - r degrees of freedom,
but under the model of Eq. (4.4), $SSE_W/\sigma^2$ possesses a
noncentral chi-square distribution with N - g - r degrees of
freedom and noncentrality parameter $\eta_2$, where

$$\eta_2 = (1/2\sigma^2)\beta_2'X_{2W}'[I_N - X_W(X_W'X_W)^-X_W']X_{2W}\beta_2. \qquad (4.8)$$

Shillington (1979) points out that $SSE_W$ reduces to the usual
pure error sum of squares, $SSE_{pure}$, when all cells contain
true replicates. This is easily seen by using the fact that
$X_W = 0$ when all cells are composed entirely of true
replicates so that $SSE_W = W'W = Y'\Sigma_0\Sigma_0Y = Y'\Sigma_0Y = SSE_{pure}$.

We saw in Section 4.3.1 that the ($Y_C$, $X_C$) system is
derived as a linear transformation of the (Y, X) system.
Similarly, the (W, $X_W$) system can be derived by applying the
transformation matrix $\Sigma_0$. Thus $W = \Sigma_0Y$, $X_W = \Sigma_0X$,
and $X_{2W} = \Sigma_0X_2$. It follows that $E(W) = \Sigma_0E(Y) = \Sigma_0X\beta_1 = X_W\beta_1$
under the model of Eq. (4.3), and $E(W) = \Sigma_0(X\beta_1 + X_2\beta_2) = X_W\beta_1 + X_{2W}\beta_2$
under the model of Eq. (4.4).
Furthermore, $\mathrm{var}(W) = \Sigma_0\,\mathrm{var}(Y)\Sigma_0' = \sigma^2\Sigma_0\Sigma_0 = \sigma^2\Sigma_0$, since $\Sigma_0$
is symmetric and idempotent.

Since the variance-covariance matrix of W is not equal
to $aI_N$, for some positive constant a, $SSE_W$ should have been
derived as the residual sum of squares from a weighted least
squares regression analysis of W on $X_W$ rather than from the
ordinary least squares regression of W on $X_W$ that
Shillington (1979) suggested. We shall use the weighted
least squares regression of W on $X_W$ in an attempt to replace
$SSE_W$ in the F ratio in Eq. (4.2) by an expression we will
call $SSE_W$(weighted). We later show that $SSE_W$ and
$SSE_W$(weighted) are equivalent.

The variance-covariance matrix of W, which is $\sigma^2\Sigma_0$, is
of rank N - g and is thus singular. Therefore the residual
sum of squares from a weighted least squares regression
analysis of W on $X_W$, which is

$$W'[\Sigma_0^{-1} - \Sigma_0^{-1}X_W(X_W'\Sigma_0^{-1}X_W)^-X_W'\Sigma_0^{-1}]W,$$

cannot be used as an expression for $SSE_W$(weighted), since
$\Sigma_0^{-1}$ does not exist. The problem of performing a weighted
least squares regression analysis when the variance-covariance
matrix of W is singular is considered in the next
section.
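
The singularity of $\Sigma_0$ is easy to see numerically for the cell sizes 1, 2, 3 and 2 used in Section 4.2. The sketch below relies on two standard facts that are not spelled out in the text: the rank of an idempotent matrix equals its trace, and a square matrix with a nonzero vector in its null space has no inverse. The helper name `sigma0` is hypothetical.

```python
from fractions import Fraction as Fr

def sigma0(cell_sizes):
    """I_N minus the block diagonal matrix with blocks (1/n_j) J_j."""
    N = sum(cell_sizes)
    S = [[Fr(0)] * N for _ in range(N)]
    start = 0
    for n in cell_sizes:
        for i in range(start, start + n):
            for j in range(start, start + n):
                S[i][j] = Fr(int(i == j)) - Fr(1, n)
        start += n
    return S

sizes = [1, 2, 3, 2]
N, g = sum(sizes), len(sizes)
S0 = sigma0(sizes)

trace = sum(S0[i][i] for i in range(N))   # equals rank for an idempotent matrix
row_sums = [sum(row) for row in S0]       # Sigma_0 applied to a vector of ones

print(trace == N - g)
print(all(s == 0 for s in row_sums))
```

Here the trace is 4 = N - g, and every row of $\Sigma_0$ sums to zero, so the vector of ones lies in the null space and $\Sigma_0^{-1}$ cannot exist.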

4.4   Development of $SSE_W$(weighted)


If the variance-covariance matrix of W, $\sigma^2\Sigma_0$, is
nonsingular then the weighted least squares formula for
$SSE_W$(weighted) is

$$SSE_W(\text{weighted}) = (W - X_W\hat{\beta}_1)'\Sigma_0^{-1}(W - X_W\hat{\beta}_1),$$

where $\hat{\beta}_1$ is a solution to $X_W'\Sigma_0^{-1}X_W\beta_1 = X_W'\Sigma_0^{-1}W$ and can be
written as $\hat{\beta}_1 = (X_W'\Sigma_0^{-1}X_W)^-X_W'\Sigma_0^{-1}W$. The quantity
$SSE_W$(weighted) divided by the appropriate degrees of freedom
provides an unbiased estimate of $\sigma^2$ under the model of Eq.
(4.3).

However, since $\Sigma_0$ is singular, the weighted regression
formula above cannot be used to calculate $SSE_W$(weighted).
C. R. Rao (1971, 1972, and 1973) suggests an analog of
weighted least squares for the case of a singular variance-covariance
matrix. Rao suggests the existence of a matrix H
such that $\hat{\beta}_1$ is a stationary vector value of
$(W - X_W\beta_1)'H(W - X_W\beta_1)$, in which case $\sigma^2$ may be estimated
using

$$\hat{\sigma}^2 = (W - X_W\hat{\beta}_1)'H(W - X_W\hat{\beta}_1)/(N - g - r)$$

where $(N - g - r) = \mathrm{rank}(\Sigma_0 : X_W) - \mathrm{rank}(X_W)$. The rank of the
matrix $(\Sigma_0 : X_W)$ is equal to N - g because $X_W = \Sigma_0X_W$, so that
the columns of $X_W$ are spanned by the columns of $\Sigma_0$; thus
$\mathrm{rank}(\Sigma_0 : X_W) = \mathrm{rank}(\Sigma_0) = N - g$.

One form of the matrix H is defined (Rao, 1971 and
1972) as

$$H = [\Sigma_0 + c^2X_WX_W']^- \qquad (4.9)$$

where c is an arbitrary nonzero constant, so that with the
model of Eq. (4.3), $\hat{\sigma}^2 = (W - X_W\hat{\beta}_1)'H(W - X_W\hat{\beta}_1)/(N - g - r)$
is an unbiased estimator for $\sigma^2$. Thus a stationary vector
value of $(W - X_W\beta_1)'H(W - X_W\beta_1)$ is given by
$\hat{\beta}_1 = (X_W'HX_W)^-X_W'HW$. Rao indicates (1972, p. 3) that $\hat{\sigma}^2$ is
invariant to the choices of the generalized inverses
involved in $\hat{\sigma}^2$.

Rao's proofs for obtaining an unbiased estimator
$\hat{\sigma}^2$ for $\sigma^2$ are not given in detail. Therefore we shall state
and prove the following theorem which will be used to
develop an expression for $SSE_W$(weighted). The notation $A^-$
will be used to denote any generalized inverse of a matrix
A, such that $AA^-A = A$.


Theorem 4.1. Let $Y \sim (X\beta, \sigma^2 G)$, where G is singular;
then $\hat{\sigma}^2 = f^{-1}(Y - X\hat{\beta})'T^{-}(Y - X\hat{\beta})$

(i) is an unbiased estimate of $\sigma^2$, and

(ii) is unique with probability one, and

(iii) is a scalar multiple of a central chi-square
variable with f degrees of freedom of the form
$(\sigma^2/f)\chi^2_f$ if Y has the multivariate normal
distribution.

The vector Y is of order $N \times 1$, $\beta$ is a $p \times 1$ vector of
unknown regression coefficients, X is an $N \times p$ matrix of
known constants, G is an $N \times N$ positive semi-definite
matrix of known constants, $T = G + XX'$, $\hat{\beta}$ is any
solution to $X'T^{-}X\beta = X'T^{-}Y$, that is, $\hat{\beta} = (X'T^{-}X)^{-}X'T^{-}Y$,
and $f = \mathrm{rank}(G : X) - \mathrm{rank}(X)$.

The proofs of parts (i), (ii) and (iii) of Theorem 4.1 are
given in Appendices 9, 10, and 11, respectively. Lemma 4.1,
which is used in the proof of Theorem 4.1, is stated and
proved in Appendix 8.

The results of Theorem 4.1 can now be applied to our
problem of finding an expression for $\mathrm{SSE}_W$(weighted). We
define $\mathrm{SSE}_W$(weighted) as

$$\mathrm{SSE}_W(\text{weighted}) = W'[T_0^{-} - T_0^{-}X_W(X_W'T_0^{-}X_W)^{-}X_W'T_0^{-}]W \qquad (4.10)$$

where $T_0 = \Sigma_0 + X_WX_W'$. Writing $\mathrm{SSE}_W$(weighted) in Eq. (4.10)
as $\mathrm{SSE}_W(\text{weighted}) = (W - X_W\hat{\beta}_1)'T_0^{-}(W - X_W\hat{\beta}_1)$, from Theorem
4.1 we see that if the true model is of the form in Eq.
(4.3) then $\mathrm{SSE}_W(\text{weighted})/\sigma^2$ has a central chi-square
distribution with $f = \mathrm{rank}(\Sigma_0 : X_W) - \mathrm{rank}(X_W) = N - g - r$
degrees of freedom. However, if the true model is of the
form in Eq. (4.4), then $\mathrm{SSE}_W(\text{weighted})/\sigma^2$ has a noncentral
chi-square distribution with $N - g - r$ degrees of freedom
and noncentrality parameter $\Pi^*$, where

$$\Pi^* = (1/2\sigma^2)\,\beta_2'X_{2W}'[T_0^{-} - T_0^{-}X_W(X_W'T_0^{-}X_W)^{-}X_W'T_0^{-}]X_{2W}\beta_2.$$


The distribution of $\mathrm{SSE}_W(\text{weighted})/\sigma^2$ under model (4.4) is
verified by the following theorem.

Theorem 4.2. Let $Y \sim N_N(X\beta + X_2\beta_2, \sigma^2 G)$, G singular;
then $f\hat{\sigma}^2/\sigma^2 \sim \chi'^2_{f,\lambda}$, where $\hat{\sigma}^2 = f^{-1}(Y - X\hat{\beta})'T^{-}(Y - X\hat{\beta})$,
$f = \mathrm{rank}(G : X) - \mathrm{rank}(X)$, $T = G + XX'$, and
$\lambda = (1/2\sigma^2)\,\beta_2'X_2'[T^{-} - T^{-}X(X'T^{-}X)^{-}X'T^{-}]X_2\beta_2$.

The proof of Theorem 4.2 is given in Appendix 12.

4.5   Equivalence of $\mathrm{SSE}_W$ and $\mathrm{SSE}_W$(weighted)

In this section we shall show that our expression for
$\mathrm{SSE}_W$(weighted) in Eq. (4.10) is equal to Shillington's
unweighted $\mathrm{SSE}_W$ in Eq. (4.6). Thus the complex calculations
required for evaluating $\mathrm{SSE}_W$(weighted) can be avoided by
calculating the simpler form $\mathrm{SSE}_W$.

Zyskind (1967) investigated conditions under which
ordinary least squares estimators are BLUE (best linear
unbiased estimators) even though Y in the model
$Y = X\beta + \varepsilon$, where $E(\varepsilon) = 0$, does not have
variance-covariance matrix equal to $\sigma^2 I$. Zyskind assumes that
$\mathrm{var}(Y) = \sigma^2 V$, where V is non-negative and possibly singular,
and then states and proves the following necessary and
sufficient condition for ordinary least squares estimators
to be BLUE.


Theorem 1 (Zyskind, 1967). A necessary and sufficient
condition for all simple least squares linear estimators
to be also best linear unbiased estimators of the
corresponding estimable parametric function $\lambda'\beta$ in the
linear model

$$Y = X\beta + e, \quad E(e) = 0, \quad E(ee') = \sigma^2 V,$$

where V is a symmetric non-negative matrix and X is of
rank r, is that there exist a subset of r orthogonal
eigenvectors of V which forms a basis for the column
space of the matrix X.


In a second theorem, Zyskind (1967) gives several other
necessary and sufficient conditions for ordinary least
squares estimators to be BLUE. These conditions are shown
to be equivalent to the condition in Theorem 1 (Zyskind,
1967). The fifth of these conditions in Zyskind's second
theorem is that $VP = PV$, where $P = X(X'X)^{-}X'$.

Applying condition 5 of Theorem 2 (Zyskind, 1967) to
our problem of regressing W on $X_W$ we have

$$V = \Sigma_0$$

and

$$P = X_W(X_W'X_W)^{-}X_W'$$

and therefore

$$VP = \Sigma_0X_W(X_W'X_W)^{-}X_W' = X_W(X_W'X_W)^{-}X_W',$$

since $\Sigma_0X_W = \Sigma_0\Sigma_0X = \Sigma_0X = X_W$. It follows that

$$VP = X_W(X_W'X_W)^{-}X_W'\Sigma_0 = PV.$$

Therefore by Theorem 2 (Zyskind, 1967) the ordinary least
squares solutions from regressing W on $X_W$ are BLUE
estimators, and thus are equivalent to the solutions
obtained from weighted least squares. We conclude therefore
that $\mathrm{SSE}_W = \mathrm{SSE}_W(\text{weighted})$.
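The equality can also be checked numerically. The sketch below is only an illustration under stated assumptions: the cell layout, the regressors, and the response are made up, $\Sigma_0$ is assumed to be the within-cell centering matrix (idempotent, rank $N - g$), and the Moore-Penrose inverse is used as one particular choice of generalized inverse.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layout: N = 8 observations grouped into g = 3 near neighbor cells.
cells = np.array([0, 0, 0, 1, 1, 2, 2, 2])
N, g = len(cells), 3

# Assumed form of Sigma_0: the within-cell centering matrix I - Z(Z'Z)^{-1}Z',
# where Z is the N x g cell indicator matrix. It is idempotent with rank N - g.
Z = np.eye(g)[cells]
Sigma0 = np.eye(N) - Z @ np.linalg.inv(Z.T @ Z) @ Z.T

# Arbitrary illustrative regressors (intercept plus two factors) and response.
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
Y = rng.normal(size=N)

W = Sigma0 @ Y     # within-cell deviations of the response
XW = Sigma0 @ X    # within-cell deviations of the regressors


def residual_form(T_ginv, X, w):
    """Evaluate w'[T^- - T^- X (X'T^- X)^- X'T^-]w for a symmetric g-inverse T^-."""
    A = T_ginv @ X
    return w @ (T_ginv - A @ np.linalg.pinv(X.T @ A) @ A.T) @ w


# Unweighted form: the identity takes the place of T^-.
sse_unweighted = residual_form(np.eye(N), XW, W)

# Weighted form of Eq. (4.10) with T_0 = Sigma_0 + X_W X_W'.
T0 = Sigma0 + XW @ XW.T
sse_weighted = residual_form(np.linalg.pinv(T0), XW, W)

print(sse_unweighted, sse_weighted)
```

The two printed values should agree up to floating point error, whatever illustrative data are used.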

4.6   The Test Statistic

As stated in Section 4.1, Shillington (1979) proposed
that a fitted model be tested for lack of fit by using the F
ratio

$$F = \frac{\mathrm{MSE}_B}{\mathrm{MSE}_W}$$

given in Eq. (4.2). In this section we shall verify that
Shillington's F ratio possesses a central F distribution
when the true model is of the form in Eq. (4.3), and
possesses a doubly noncentral F distribution when the true
model is of the form in Eq. (4.4). This information on the
distribution of the F ratio will be needed in Section 4.7,
where the power of the test is discussed. Additionally, we
shall give the form of the expected values of both the
numerator, $\mathrm{MSE}_B$, and the denominator, $\mathrm{MSE}_W$, of the F ratio
in Eq. (4.2). These expected values will aid us in
developing a procedure for calculating the power of the
test, since they will be used to determine whether the test
is upper tailed or lower tailed.

In developing the distribution of the F ratio in Eq.
(4.2), we shall show that $\mathrm{SSE}_B/\sigma^2$ and $\mathrm{SSE}_W/\sigma^2$ are
statistically independent. To this end, let us use the
expression for $\mathrm{SSE}_B$ in Eq. (4.5) and the fact that $Y_B = MY$
to express $\mathrm{SSE}_B$ as

$$\mathrm{SSE}_B = Y'M'[G_0^{-1} - G_0^{-1}X_B(X_B'G_0^{-1}X_B)^{-1}X_B'G_0^{-1}]MY.$$

Also, using the expression for $\mathrm{SSE}_W$ in Eq. (4.6) (which is
allowed because we showed in Section 4.5 that the correct
form, $\mathrm{SSE}_W$(weighted), is equal to $\mathrm{SSE}_W$) and using the fact
that $W = \Sigma_0Y$, we can express $\mathrm{SSE}_W$ as

$$\mathrm{SSE}_W = Y'\Sigma_0[I_N - X_W(X_W'X_W)^{-}X_W']\Sigma_0Y.$$

By Theorem 4 (Searle, 1971, p. 59), to show that
$\mathrm{SSE}_B/\sigma^2$ and $\mathrm{SSE}_W/\sigma^2$ are statistically independent, it
suffices to show that the matrix product

$$M'[G_0^{-1} - G_0^{-1}X_B(X_B'G_0^{-1}X_B)^{-1}X_B'G_0^{-1}]M\,\Sigma_0[I_N - X_W(X_W'X_W)^{-}X_W']\Sigma_0$$

is equal to the zero matrix. This is seen to be true since
$M\Sigma_0 = 0$, and therefore $\mathrm{SSE}_B/\sigma^2$ and $\mathrm{SSE}_W/\sigma^2$ are independent.
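The orthogonality argument can be illustrated numerically. The sketch below rests on assumptions that are not spelled out in this section: M is taken to be the $g \times N$ matrix that averages observations within cells, $\Sigma_0$ the matrix that centers observations within cells, $G_0 = MM'$, and the between-cell regressor matrix is written `XB = M X`. The data and cell layout are made up.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: N = 10 points in g = 5 near neighbor cells, p = 2 parameters.
cells = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4])
N, g = len(cells), 5
x = rng.normal(size=N)
X = np.column_stack([np.ones(N), x])
p = X.shape[1]
Y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=N)

# Assumed forms: M averages within cells, Sigma_0 centers within cells.
Z = np.eye(g)[cells]
M = np.linalg.inv(Z.T @ Z) @ Z.T      # g x N, so M @ Y holds the cell means
Sigma0 = np.eye(N) - Z @ M            # N x N, so Sigma0 @ Y holds the deviations
G0_inv = np.linalg.inv(M @ M.T)       # assumes var(M Y) = sigma^2 * M M'

XB, XW = M @ X, Sigma0 @ X

# Matrices of the two quadratic forms: SSE_B = Y'A Y and SSE_W = Y'B Y.
inner = G0_inv - G0_inv @ XB @ np.linalg.inv(XB.T @ G0_inv @ XB) @ XB.T @ G0_inv
A = M.T @ inner @ M
B = Sigma0 @ (np.eye(N) - XW @ np.linalg.pinv(XW.T @ XW) @ XW.T) @ Sigma0

r = np.linalg.matrix_rank(XW)
mse_b = (Y @ A @ Y) / (g - p)
mse_w = (Y @ B @ Y) / (N - g - r)

print("max |M Sigma_0| =", np.abs(M @ Sigma0).max())
print("max |A B|       =", np.abs(A @ B).max())
print("F =", mse_b / mse_w)
```

Both maxima should be zero up to rounding, which is the condition that makes the two quadratic forms independent under normality.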

When the fitted model and the true model are both of
the form in Eq. (4.3), then the F ratio in Eq. (4.2)
possesses a central F distribution with $g - p$ and $N - g - r$
degrees of freedom in the numerator and denominator,
respectively. Furthermore, the numerator, $\mathrm{MSE}_B$, of the F
ratio in Eq. (4.2) has expectation equal to

$$E(\mathrm{MSE}_B) = [\sigma^2/(g - p)]\,E\chi^2_{g-p} = \sigma^2.$$

Similarly, under model (4.3), the denominator, $\mathrm{MSE}_W$, of
the F ratio has expectation equal to

$$E(\mathrm{MSE}_W) = [\sigma^2/(N - g - r)]\,E\chi^2_{N-g-r} = \sigma^2.$$


When the fitted model suffers from lack of fit and the
true model is given by Eq. (4.4), the F ratio in Eq. (4.2)
is a ratio of two statistically independent noncentral
chi-square random variables, each divided by its respective
degrees of freedom. Thus the F ratio in Eq. (4.2) possesses
a doubly noncentral F distribution with $g - p$ and $N - g - r$
degrees of freedom and noncentrality parameters $\Pi_1$ and $\Pi_2$,
where $\Pi_1$ and $\Pi_2$ were given in Eqs. (4.7) and (4.8),
respectively. The expected value of the numerator, $\mathrm{MSE}_B$, of
the F ratio can be written as

$$E(\mathrm{MSE}_B) = [\sigma^2/(g - p)]\,E\chi'^2_{g-p,\Pi_1} = \sigma^2 + \beta_2'C_1\beta_2/(g - p)$$

where

$$C_1 = X_2'M'[G_0^{-1} - G_0^{-1}X_B(X_B'G_0^{-1}X_B)^{-1}X_B'G_0^{-1}]MX_2. \qquad (4.11)$$
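The second form of the expectation follows from the mean of a noncentral chi-square variable. As a short worked step, assume the noncentrality convention used in this chapter, $\Pi_1 = \beta_2'C_1\beta_2/2\sigma^2$, under which a noncentral chi-square variable with $f$ degrees of freedom and noncentrality $\Pi$ has mean $f + 2\Pi$:

```latex
\begin{aligned}
E(\mathrm{MSE}_B)
  &= \frac{\sigma^2}{g-p}\,E\bigl(\chi'^{2}_{g-p,\Pi_1}\bigr)
   = \frac{\sigma^2}{g-p}\,\bigl(g-p+2\Pi_1\bigr) \\
  &= \sigma^2 + \frac{2\sigma^2}{g-p}\cdot\frac{\beta_2' C_1 \beta_2}{2\sigma^2}
   = \sigma^2 + \frac{\beta_2' C_1 \beta_2}{g-p}.
\end{aligned}
```

The same step with $N - g - r$ degrees of freedom and the denominator noncentrality parameter gives the corresponding expression for the denominator mean square.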

Similarly, under model (4.4), the expected value of the
denominator, $\mathrm{MSE}_W$, of the F ratio can be written as

$$E(\mathrm{MSE}_W) = [\sigma^2/(N - g - r)]\,E\chi'^2_{N-g-r,\Pi_2} = \sigma^2 + \beta_2'C_2\beta_2/(N - g - r)$$

where

$$C_2 = X_2'\Sigma_0[I_N - X_W(X_W'X_W)^{-}X_W']\Sigma_0X_2. \qquad (4.12)$$

4.7   The Testing Procedure and its Power

As discussed in Section 4.1, Shillington (1979)
suggested that lack of fit of the fitted model be inferred
when the value of the F statistic in Eq. (4.2) exceeds
$F_{\alpha;\,g-p,\,N-g-r}$. The test, however, is not always upper
tailed, and in fact can be lower tailed. The test is
considered lower tailed when, because of lack of fit, the
expected value of the numerator of the F ratio is less than
the expected value of the denominator of the F ratio.

We suggest that lack of fit be tested with an upper
tailed test using the F ratio $F = \mathrm{MSE}_B/\mathrm{MSE}_W$ when the matrix
D, which is defined as

$$D = \frac{C_1}{g - p} - \frac{C_2}{N - g - r} \qquad (4.13)$$

is found to be positive definite (which can only occur when
$C_1$ is positive definite, by Theorem 3.1 in Appendix 4). The
matrices $C_1$ and $C_2$ in Eq. (4.13) are defined in Eqs. (4.11)
and (4.12), respectively. An upper tailed test is
appropriate when the matrix D is positive definite because
no matter what the value of $\beta_2$ is, the expected value of the
numerator, $\mathrm{MSE}_B$, of the F ratio will be greater than the
expected value of the denominator, $\mathrm{MSE}_W$, of the F ratio.
However, there may be cases where D is negative definite
(which can only occur when $C_2$ is positive definite), and in
this case lack of fit should be tested with a lower tailed
rejection region. If D is indefinite, then the F test for
lack of fit may be upper tailed, lower tailed, or lack of
fit may not be testable, depending upon the value of $\beta_2$.
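The classification by the definiteness of D can be sketched in a few lines. This is a minimal illustration, not the procedure used in the dissertation: the function name, the tolerance, and the example matrices are hypothetical, and definiteness is judged from the eigenvalues of D.

```python
import numpy as np


def tail_of_test(C1, C2, df_num, df_den, tol=1e-8):
    """Classify the lack of fit F test from the eigenvalues of D = C1/df_num - C2/df_den."""
    D = C1 / df_num - C2 / df_den
    eig = np.linalg.eigvalsh((D + D.T) / 2)   # symmetrize against round-off
    scale = max(np.abs(eig).max(), 1.0)
    if np.all(eig > tol * scale):
        return "upper tailed", eig            # D positive definite
    if np.all(eig < -tol * scale):
        return "lower tailed", eig            # D negative definite
    return "depends on beta_2", eig           # D indefinite or semidefinite


# Made-up matrices, with g - p = 3 and N - g - r = 4 degrees of freedom.
print(tail_of_test(np.diag([6.0, 3.0]), np.eye(2), 3, 4))
print(tail_of_test(np.diag([6.0, 0.3]), np.diag([1.0, 4.0]), 3, 4))
```

Semidefinite cases are deliberately grouped with the indefinite ones here, since some directions of the lack of fit parameter vector then leave the two expected mean squares equal.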

In those cases where D is indefinite, it is helpful to
write the quantity $\beta_2'D\beta_2$, which represents the difference
between the expected value of the numerator and the expected
value of the denominator of the F ratio, $F = \mathrm{MSE}_B/\mathrm{MSE}_W$, as

$$\beta_2'D\beta_2 = \beta_2'[U_1 : U_2 : U_3]\,\mathrm{diag}[\Gamma_1,\ \Gamma_2 = 0,\ \Gamma_3]\,[U_1 : U_2 : U_3]'\beta_2$$

$$= \beta_2'U_1\Gamma_1U_1'\beta_2 + \beta_2'U_3\Gamma_3U_3'\beta_2,$$

where $U_1$, $U_2$, and $U_3$ are matrices whose columns are
orthonormal eigenvectors of D, and $\Gamma_1$, $\Gamma_2$, and $\Gamma_3$ are
diagonal matrices whose elements are the positive, zero, and
negative eigenvalues of D, respectively. Lack of fit is
testable with an upper tailed test if $\beta_2$ is in the column
space of $[U_1 : U_2]$, but not entirely in the column space of
$U_2$, since then $\beta_2'D\beta_2$ is positive. Similarly, lack of fit is
testable with a lower tailed test if $\beta_2$ is in the column
space of $[U_2 : U_3]$, but not entirely in the column space of
$U_2$, since then $\beta_2'D\beta_2$ is negative. If $\beta_2$ is in the column
space of $U_2$, then lack of fit cannot be tested, since
$\beta_2'D\beta_2$ will equal zero.

We define $\beta_2$ to be in the column space of $U_2$ if the
matrix $L_1'L_1$ has a zero eigenvalue, where $L_1 = [\beta_2 : U_2]$.
Similarly, letting $L_2 = [\beta_2 : U_1 : U_2]$ and $L_3 = [\beta_2 : U_2 : U_3]$,
$\beta_2$ is in the column space of $[U_1 : U_2]$ if $L_2'L_2$ has a zero
eigenvalue, and $\beta_2$ is in the column space of $[U_2 : U_3]$
if $L_3'L_3$ has a zero eigenvalue.

When D is positive definite or D is indefinite but
$\beta_2$ is upper tailed testable, then the F test for lack of fit
which makes use of the F ratio $F = \mathrm{MSE}_B/\mathrm{MSE}_W$ is a test of
the hypotheses (see Theorem 3.2, Appendix 5)

$$H_0:\ \Pi_1 = \Pi_2 = 0$$

$$H_a:\ \Pi_1/(g - p) - \Pi_2/(N - g - r) > 0.$$

When D is negative definite or D is indefinite but $\beta_2$ is
lower tailed testable, then the F test tests

$$H_0:\ \Pi_1 = \Pi_2 = 0$$

$$H_a:\ \Pi_1/(g - p) - \Pi_2/(N - g - r) < 0.$$

In the case where D is indefinite and $\beta_2$ is in the column
space of $U_2$, then no hypotheses concerning lack of fit of
the fitted model can be tested.

When the test is upper tailed, the power of the F test
for lack of fit is defined as

$$\text{Power} = P\{F''_{g-p,\,N-g-r;\,\Pi_1,\Pi_2} > F_{\alpha;\,g-p,\,N-g-r}\} \qquad (4.14)$$

where $F_{\alpha;\,g-p,\,N-g-r}$ is the upper $100\alpha$ percentage point of the
central F distribution with $g - p$ and $N - g - r$ degrees of
freedom. In the case of a lower tailed test, the power of
the test is defined as

$$\text{Power} = P\{F''_{g-p,\,N-g-r;\,\Pi_1,\Pi_2} < F_{1-\alpha;\,g-p,\,N-g-r}\}. \qquad (4.15)$$
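The probabilities in Eqs. (4.14) and (4.15) can also be estimated by simulation. The sketch below is a Monte Carlo stand-in and is not the Johnson and Kotz approximation used in this dissertation. It assumes the noncentrality convention in which a noncentral chi-square variable with f degrees of freedom and noncentrality $\Pi$ has mean $f + 2\Pi$, so NumPy's `nonc` argument is set to twice that value. The function name and defaults are hypothetical, and the critical value is supplied by the user.

```python
import numpy as np


def doubly_noncentral_f_power(df1, df2, pi1, pi2, crit, upper=True,
                              n_sim=200_000, seed=0):
    """Monte Carlo estimate of P{F'' > crit} (upper=True) or P{F'' < crit}."""
    rng = np.random.default_rng(seed)
    # NumPy's nonc is the sum of squared means, i.e. twice the noncentrality
    # parameter under the (1 / 2 sigma^2) convention assumed here.
    num = rng.noncentral_chisquare(df1, 2.0 * pi1, n_sim) / df1
    den = rng.noncentral_chisquare(df2, 2.0 * pi2, n_sim) / df2
    f = num / den
    return float(np.mean(f > crit) if upper else np.mean(f < crit))


# With both noncentrality parameters zero the estimate is the size of the test.
# 5.81 is the upper 5 percent point of the central F with 19 and 4 degrees of
# freedom quoted in Section 4.8.2.
size = doubly_noncentral_f_power(19, 4, 0.0, 0.0, crit=5.81)
power = doubly_noncentral_f_power(19, 4, 9.79, 0.08, crit=5.81)
print(size, power)
```

Because the estimate is simulated, it need not reproduce the tabled approximate power values exactly.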

4.8   Selection of Near Neighbor Groupings

In the preceding sections of this chapter, we have
discussed a near neighbor procedure which uses the F ratio
$F = \mathrm{MSE}_B/\mathrm{MSE}_W$ to test a fitted model for lack of fit. In
this section we shall investigate the effect that different
groupings of response observations into near neighbor cells
have on the testing procedure and its power. From equations
(4.14) and (4.15) in the previous section it is evident that
the power of the F test for lack of fit, which makes use of
the F ratio $F = \mathrm{MSE}_B/\mathrm{MSE}_W$, depends on the values of the
numerator and denominator noncentrality parameters,
$\Pi_1$ and $\Pi_2$. Assuming the numerator and denominator degrees
of freedom are fixed, and the test is upper tailed, then the
power is an increasing function of increasing values
of $\Pi_1$ and is a decreasing function of increasing values of
$\Pi_2$ (see Appendix 1). When the test is lower tailed, the
power is an increasing function of $\Pi_2$ and is a decreasing
function of $\Pi_1$.

Since both the numerator and denominator noncentrality
parameters, $\Pi_1$ and $\Pi_2$, are functions of the groupings of the
data points into near neighbor cells (as are the numerator
and denominator degrees of freedom), we would like to
investigate the effect of the number and composition of
cells on the power of the F test. Intuitively, it would
seem that homogeneous near neighbor cells would minimize the
bias inherent in estimating $\sigma^2$ with $\mathrm{MSE}_W$, and thus would
minimize $\Pi_2$ and maximize the power of an upper tailed
test. However, any grouping of the data points would also
influence the numerator noncentrality parameter and the
numerator and denominator degrees of freedom. Therefore
while a grouping of data points into homogeneous cells might
decrease $\Pi_2$ and thus apparently increase the power of an
upper tailed test, the result of the grouping on the power
also depends on how the degrees of freedom, $g - p$ and
$N - g - r$, and the numerator noncentrality parameter, $\Pi_1$,
are affected.

We will attempt to find homogeneous cells of near
neighbor points by using an iterative partitioning
clustering algorithm. Two examples will be presented. The
first example makes use of the stack loss data presented by
Daniel and Wood (1971) and later analyzed by Shillington
(1979). The second example involves data from a mixture
experiment discussed by Piepel (1981).

Our objective is to investigate the effect of forming
homogeneous cells of near neighbors on the F test for lack of
fit which makes use of the test statistic $F = \mathrm{MSE}_B/\mathrm{MSE}_W$.
It is hoped that such homogeneous groupings will increase
the power of the F test (assuming that the rejection region
is upper tailed) by decreasing $\Pi_2$ for a fixed number of near
neighbor cells (and thus fixed values for the degrees of
freedom). The effect that homogeneous grouping has on $\Pi_1$ is
not clear, but is of interest, since the magnitude of
$\Pi_1$ also affects the power of the test. Additionally, we
will vary the number of cells of near neighbors in an
attempt to determine how this affects the power of the test,
since the number of cells affects both the noncentrality
parameters and the degrees of freedom.

The  algorithm  used  for  grouping  the  data  points  into 
homogeneous  near  neighbor  cells  can  be  described  as  an 
iterative  partitioning  type  of  cluster  analysis.   The 
computations  involved  were  accomplished  using  the  RELOC 
procedure  available  in  the  CLUSTAN  1C  computer  package 
(Wishart,  1975).   All  computations  were  performed  using  data 
points  whose  coordinates  were  standardized  by  subtracting 
off  sample  means  and  dividing  by  sample  standard  deviations. 

Initially, k clusters (near neighbor cells) of the N
data points in the factor space are arbitrarily defined.
Then the Euclidean distance between each point and the
centroid (average vector value) of each of the k clusters is
determined. If a point is found to be closer to the
centroid of one of the other k - 1 clusters than to the
centroid of the cluster in which it is currently classified,
then the point is reclassified into that nearest cluster
(cell). The centroids of the k clusters are then
recalculated, taking into account any reclassified point.
The entire set of N points is scanned repeatedly in this
manner until no reclassification occurs. This method of
assigning points to clusters will be referred to as
iterative relocation.
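The relocation step can be sketched as follows. This is a minimal illustration of the description above and not the RELOC procedure itself, whose implementation details are not given here. The function name, the sweep limit, and the example points are hypothetical; coordinates are standardized by sample means and sample standard deviations before distances are computed.

```python
import numpy as np


def iterative_relocation(points, labels, max_sweeps=100):
    """Move each point to the cluster with the nearest centroid until no point moves."""
    points = np.asarray(points, dtype=float)
    X = (points - points.mean(axis=0)) / points.std(axis=0, ddof=1)  # standardize
    labels = np.asarray(labels).copy()
    for _ in range(max_sweeps):
        moved = False
        for i in range(len(X)):
            ids = np.unique(labels)
            # Centroids are recomputed so that earlier reclassifications count.
            centroids = np.array([X[labels == k].mean(axis=0) for k in ids])
            nearest = ids[np.argmin(np.linalg.norm(centroids - X[i], axis=1))]
            if nearest != labels[i]:
                labels[i] = nearest
                moved = True
        if not moved:
            break
    return labels


# Two well separated groups with a deliberately poor starting assignment.
pts = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
start = [0, 0, 1, 1, 1, 0]
final = iterative_relocation(pts, start)
print(final)
```

In this sketch a cluster that loses all of its points simply disappears, and the sweep limit guards against cycling when distances tie.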

In  the  second  stage  of  the  algorithm,  two  of  the  k 
clusters  arrived  at  by  the  iterative  relocation  procedure 
are  fused,  resulting  in  k  -  1  clusters.   The  two  clusters  to 
be  fused  are  selected  as  those  which  when  fused  produce  the 
k  -  1  clusters  with  minimum  "error  sum  of  squares."   The 
error  sum  of  squares  is  defined  as  the  sum  of  squared 
Euclidean  distances  between  every  point  and  the  centroid 
point  of  the  cluster  to  which  it  belongs.   After  k  -  1 
clusters  are  determined   using  the  error  sum  of  squares 
criterion,  iterative  relocation  is  applied  to  the  k  -  1 
groups  in  an  effort  to  improve  the  clusterings.   This 
alternation  of  fusion  and  iterative  relocation  continues  for 
k  -  2  clusters,  k  -  3  clusters,  ...,  2  clusters,  or  until  a 
specified  minimum  number  of  clusters  is  reached.   The 
question  of  determining  an  "optimal"  number  of  clusters  is 
not  addressed  by  this  procedure. 
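The fusion step can be sketched in the same spirit. The code below is an illustration of the error sum of squares criterion as described in the text, with hypothetical function names and made-up one-dimensional data; it tries every pair of clusters and keeps the merge that yields the smallest total error sum of squares.

```python
import numpy as np
from itertools import combinations


def error_sum_of_squares(X, labels):
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    return float(sum(((X[labels == k] - X[labels == k].mean(axis=0)) ** 2).sum()
                     for k in np.unique(labels)))


def fuse_best_pair(X, labels):
    """Merge the two clusters whose fusion gives the smallest error sum of squares."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    best = None
    for a, b in combinations(np.unique(labels), 2):
        trial = np.where(labels == b, a, labels)   # relabel cluster b as cluster a
        ess = error_sum_of_squares(X, trial)
        if best is None or ess < best[0]:
            best = (ess, trial)
    return best[1], best[0]


# Three clusters on a line; the two left-hand clusters are the cheapest to fuse.
X_demo = np.array([[0.0], [1.0], [5.0], [6.0], [20.0]])
fused, ess = fuse_best_pair(X_demo, [0, 0, 1, 1, 2])
print(fused, ess)
```

Alternating this step with iterative relocation, one cluster fewer each time, gives the sequence of groupings described above.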
4.8.1  Example 1: Stack Loss Data

The first example we investigate is the 21 observation
stack loss data of Daniel and Wood (1971), which was
analyzed by Shillington (1979). The data (see Table 8)
consist of the values of three factors, $x_1$ (air flow), $x_2$
(cooling water inlet temperature), and $x_3$ (acid
concentration), along with the values of a response variable, Y
(stack loss). A first order regression equation of the form

$$E(Y) = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_3$$

is fitted using 17 of the original 21 observations
(Shillington discarded 4 of the original 21 observations as
outliers). We assume the true model to have the form

$$E(Y) = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_3 + \beta_{11}x_1^2 + \beta_{22}x_2^2 + \beta_{33}x_3^2$$

and thus to contain $p_2 = 3$ terms in addition to the $p = 4$
terms in the fitted model. We wish to investigate the
capability of the F test

$$F = \frac{\mathrm{MSE}_B}{\mathrm{MSE}_W}$$

in detecting lack of fit of the fitted model.

We first consider the use of the F statistic with the
six cells of near neighbors used by Shillington (1979),
which is the same near neighbor grouping suggested by Daniel
and Wood (1971). This 6 cell grouping (see Table 8 under
the column heading "6**") is found to yield a matrix
$D = [C_1/(g - p)] - [C_2/(N - g - r)]$ which is indefinite,
since the eigenvalues of D have the values 12110, -7, and
-1415 (see Table 9). Thus the test is not upper tailed,
since D must be positive definite for an upper tailed test
to exist for all values of $\beta_2$, where in this example,
$\beta_2 = (\beta_{11}, \beta_{22}, \beta_{33})'$.

When the 6 cell grouping of near neighbors generated by
the iterative partitioning clustering algorithm (see Table 8
under the column heading "6") is used, the values of the
eigenvalues of D are 49090, 379, and -43, and the test is
still not upper tailed for all values of $\beta_2$.

We then use the iterative partitioning clustering
algorithm to determine homogeneous cell groupings for 5, 7,
8, 9, 10, 11, and 12 cells. The matrix D is found to be
indefinite for the groupings into 5, 7, or 8 cells, but D is
positive definite for 9, 10, 11, or 12 cells of near
neighbor groupings. Thus, no matter what the value of $\beta_2$,
lack of fit can be tested with an upper tailed test using
the 9, 10, 11, or 12 cell groupings of near neighbors.

The value of $F = \mathrm{MSE}_B/\mathrm{MSE}_W$ was calculated using the
matrix procedure from the 1979 version of SAS. None of the
near neighbor groupings provided evidence of lack of fit,
and thus we cannot conclude that there is lack of fit when
the fitted model is $E(Y) = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_3$
and the true model contains only pure quadratic terms in
addition to the first degree terms.

For the groupings of near neighbors into cells which
provide an upper tailed test (9, 10, 11, or 12 cells), the
power of the upper tailed test can be approximated if $\Pi_1$
and $\Pi_2$ can be specified. This approximate power can be
calculated using an approximation to the doubly noncentral F
distribution given in Johnson and Kotz (1970, p. 197), which
is described in Appendix 6 of this dissertation. Thus we
calculate an approximation for

$$P\{F''_{g-p,\,N-g-r;\,\Pi_1,\Pi_2} > F_{\alpha;\,g-p,\,N-g-r}\}.$$

In order to compare the power of the upper tailed F
test for 9, 10, 11, and 12 cells, we will assume arbitrarily
that the true value of the parameter vector $\beta_2$ is
$\beta_2 = (\beta_{11}, \beta_{22}, \beta_{33})' = (.044, .329, -.033)'$, which is
arrived at by taking $\beta_2 = 10\hat{\beta}_2$, where $\hat{\beta}_2$ is the least squares
estimate of $\beta_2$ calculated from the data. Furthermore,
taking $\sigma^2 = 1.6$ (since the residual mean square value from
fitting the "true" second degree model is MSE = 1.6) we
calculate the values for $\Pi_1 = \beta_2'C_1\beta_2/2\sigma^2$ and
$\Pi_2 = \beta_2'C_2\beta_2/2\sigma^2$ for each of the 9, 10, 11, and 12 cell
groupings. The calculated values of $\Pi_1$ and $\Pi_2$, as well as
the approximate power values for each of the four F tests
(calculated using the approximation to $F''$ from Johnson and
Kotz (1970, p. 197)) are presented in Table 9. The power is
quite high (> .96) for 9, 10, or 11 cell groupings, but
drops off to .79 for the 12 cell grouping. This drop in
power seems to be due to the effect of having only 3 degrees
of freedom in the denominator of the F ratio.

In summary, this example illustrates that the F test
for lack of fit that makes use of the statistic
$F = \mathrm{MSE}_B/\mathrm{MSE}_W$ is upper tailed only for certain groupings of
the design points into near neighbor cells. For the near
neighbor cell groupings that provide an upper tailed test,
the power is generally high for the values of $\beta_2$ and
$\sigma^2$ that we selected, but decreases slightly as we move from
9 to 10 to 11 cells and decreases more severely as we move
from 11 to 12 cells. This more severe decrease in power is
due to the decrease to only 3 denominator degrees of
freedom.

4.8.2  Example 2: Glass Leaching Data

The second example we investigate is one in which the
leachability, Y, of glass is assumed to be a function of the
proportions of eleven chemicals of which the glass is
composed (Piepel, 1981). A first order Scheffe polynomial
model was fitted to the common logarithms of the
leachability values, that is, the fitted model is of the
form

$$E(\log_{10} Y) = \sum_{i=1}^{11} \beta_i x_i.$$

The experimental design coordinates and the values of
the 44 data observations are presented in Table 10. For
illustrative purposes, the true model is assumed to contain
the $p_2 = 8$ second order cross product terms,
$\beta_{68}x_6x_8$, $\beta_{3,11}x_3x_{11}$, $\beta_{79}x_7x_9$, $\beta_{56}x_5x_6$, $\beta_{7,11}x_7x_{11}$,
$\beta_{15}x_1x_5$, $\beta_{6,10}x_6x_{10}$, and $\beta_{59}x_5x_9$, in addition to the $p = 11$
first order terms in the fitted model. The 19-term model is
the final fitted model proposed by Piepel and serves as our
true model.

Piepel  (1981)  suggested  that  the  four  sets  of 
observations  (see  Table  10) 

(a)  14  and  15 

(b)  18,  19,  and  20 

(c)  25,  26,  and  27 

(d)  39,  40,  and  41 

were  intended  to  constitute  four  cells  of  replicate 
observations  for  use  in  estimating  pure  experimental 
error.   However,  the  settings  of  the  mixture  components  were 
not  well  controlled,  so  that  each  of  the  four  cells 
contained  near  neighbors  rather  than  replicates.   By 
defining  each  of  the  remaining  33  data  points  as  33  cells 
containing  one  observation  each,  the  44  data  observations 
are  partitioned  into  37  cells.   If  we  choose  to  use  the  37 
cells  to  test  the  fitted  model  for  lack  of  fit,  with  the 
test  statistic  F  =  MSEg/MSE^,  we  find  that  there  are  no 
degrees  of  freedom  for  MSE^  so  that  the  F  statistic  cannot 
be  calculated  in  this  case.   However,  the  F  statistic  for 


136 


* 

x: 

OJ 

>^ 

o 

3 

(0 

I-H 

OJ 

nj 

J 

> 

i-H 

o 

l-t 

•H 

X 

z 

o 

m 

rH 

o 

X 

0) 

Ix, 

<Ti 

00 

X 

O 

• 

u 

(0 

4J 

00 

CN 

(C 

X 

O 

a 

•H 

Eh 

tji 

c 

•H 

r- 

O 

x: 

X 

c 

o 

N 

03 

QJ 

J 

vo 

o 

U 

X 

CN 

CO 

IT3 

(0 

2 

iH 

O 

in 

o 

X 

CP 

• 

S 

o 

rH 

dJ 

•<r 

o 

rH 

X 

(0 

Xi 

u 

(0 

&H 

m 

ro 

X 

o 

CN 

O'd'rHrHOOOOi— ioocNOO'^oou30i«Dor^oir)LnoomcO'>3' 

^1— IOO'^<yi^O^^LnTrV£)CNCNCN(NOOOOOOCN^'^^rHOM^O 
00OOOr-r0^(TiOr00>O<TiOOvDCNOOOO^>X>OOOOOCvJ 

G^  ^        ro  ro  •— t 


Lnooooo<Tiooooor~ooooooor~^ 

CNOOOCNCNOOOOOCNOCNCNCNCN^ 

oooooooooooooooooo 


•^•^orroNOooor^o 

^.HOCNfHO    OOOCNO 

oooooo  ooooo 


cTiOooocTirHOovoinovotnincDOOLn  iriLnor^cNrH  ooo^vo 

CV)OOCNO0noO(NCNCNOCNCNCNO   O^    .-H^OCNCN-H    OOOrnOJ 

oooooooooooooooo  oo  oooooo  ooooo 


oo-Hooooooor-cnooooorHin 

Of^OOOOCNOOOCNCNOmOOOOm^ 

oooooooooooooooooo 


■^  '3'   ■— I  O   00  O 
^  r-H  ro  O  CN  O 

oooooo 


CO  rr  CN   O  — I 

ro  ro  m  o  m 

OOOOO 


(Ti-HOOOOOOOOOOCnOOO 

vor~r-r>-r^ooor-oo<^v£)Ooo 

OOOOOOOOOOOOOOOO 


^  ID  •«a<'3<OrHm<Ti 
inm  roror~r^r-r~ 
oo  oooooo 


ooooo 
ooooo 
ooooo 


aNOMrHOr0CNO.HOOOOOO(T\O^ 

ino^i)voovovovDvDooooooiriix)oo 

OOOOOOOOOOOOOOOOOO 


OOOOOm    OOOCTiiHO 

cMooooov£)i>otriLnki3o 
oooooo  ooooo 


in'rrnoooo<TiCNLn^HO'H^<TiOCNinfNm<Tir~fooo(TiVD'^rorHvDon 
i^LninoinLDLnooirikriMOi— i^oo(NCNCNvDOLnLn<Ti(Ti(T>'^'-H 

^rH^M'H^^'HrH^M^,H^<H^^^     ^rHM^^i— I     OOOMrH 


ooo<Ti(Tvomoor~oovoooooMoo(T»(Ti'>D'roovoa>cNooo 
ooor~r~OrHOoor»r-r-ooooonrorocnrooooo  oor-ooooo 
ooooooooooooooooooooooooooooo 


00000000(TiOO<TiCv)00 

of^ooa>'*mrO(Ti'-H^o 

O-HOOO^rHMOOOO 


r-Ln'>!r'<j"  or~  vor^or*- 
fNromoo  o^  '^  y£>  ^  f-i 

^^^.-H   OO    OOO^ 


lack of fit can be calculated using anywhere from 12 to 33 cells of
near neighbors, and thus we refer to the iterative
partitioning clustering algorithm discussed earlier in this
section to generate near neighbor groupings of 15, 20, 25,
and 30 cells (see Table 11).

Table 11.  Near Neighbor Cells for Glass Leaching Data.

               Membership in Near Neighbor Cells*
Observation    ----------------------------------
Number            15      20      25      30

   1               1       1       1       1
   2               2       2       2       2
   3               3       3       3       3
   4               4       4       4       4
   5               5       5       5       5
   6               6       6       6       6
   7               7       7       7       7
   8               8       8       8       8
   9               9       9       9       9
  10              10      10      10      10
  11              10      10      10      10
  12              11      11      11      11
  13              12      12      12      12
  14              13      13      13      13
  15              13      13      13      13
  16               6       6      14      14
  17              14      14      15      15
  18              11      15      16      16
  19              11      15      16      16
  20              11      15      16      16
  21               3       3       3       3
  22               5       5       5      17
  23               1       1       1      18
  24               3       3       3      19
  25              15      16      17      20
  26              15      16      17      20
  27              15      16      17      20
  28               4       4      18      21
  29              12      12      12      12
  30              10      17      19      22
  31               8       8       8      23
  32               1      14      15      15
  33              10      18      20      24
  34               2       2       2       2
  35              13      19      21      25
  36              11      11      22      26
  37               8       8      23      27
  38               9       9       9      28
  39               7       7       7       7
  40               7       7       7       7
  41               7       7       7       7
  42              10      17      19      22
  43              14      20      24      29
  44               2       2      25      30

*Cell groupings generated by an iterative partitioning
cluster analysis using the CLUSTAN computer package.
Numbers in the table refer to cell membership.

The clusterings of observations into 15, 20, or 25
cells each yield a D matrix that is indefinite (see Table
12), so that the test is not upper tailed.  The 30 cell
clustering produces a positive definite D matrix, so that
the test is upper tailed for all nonzero values of $\beta_2$, where
$\beta_2$ is the $8 \times 1$ vector of parameters of the postulated true
model that do not appear in the fitted first degree model.
Taking $\beta_2 = (15.141, -112.429, -78.761, -78.275, 87.996,
13.356, -76.948, 34.721)'$, which is the least squares
estimate of $\beta_2$ from the data, taking $\sigma^2 = .008$ (which is
$MSE_{pure}$ with seven degrees of freedom from Piepel's analysis
of the data), and using the approximation of Johnson and
Kotz (1970, p. 197) to approximate

$$P\{F''_{19,4;\Pi_1,\Pi_2} > F_{.05;19,4}\},$$

where $\Pi_1 = 9.79$, $\Pi_2 = 0.08$ and $F_{.05;19,4} = 5.81$, we find
that (using 30 near neighbor cells) a value for the power of
the F test is .10.  The power increases as the magnitudes of
the elements of $\beta_2$ are increased, so for example if all the
elements of $\beta_2$ above are doubled, then $\Pi_1 = 39.16$,
$\Pi_2 = 0.32$, and the approximate power is .25.  If the
elements of $\beta_2$ above are each multiplied by 5, then
$\Pi_1 = 244.75$, $\Pi_2 = 2.00$ and the approximate power is .83.
From the entry in Table 12, we see that the calculated F
value of 20.55 with the 30 cell clustering exceeds
$F_{.05;19,4} = 5.81$, and we conclude that the fitted first
degree model is inadequate.
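
The way the two noncentrality parameters grow in the preceding paragraph can be checked with a few lines of arithmetic. The sketch below assumes only that each parameter is a quadratic form in $\beta_2$, so that multiplying every element of $\beta_2$ by a constant c multiplies the parameter by c squared; the function name `scaled_noncentrality` and the labels `Pi_1` and `Pi_2` are illustrative and do not come from the text.

```python
def scaled_noncentrality(value, c):
    """Noncentrality parameter after every element of beta_2 is multiplied by c.

    Assumes the parameter is a quadratic form in beta_2, so it scales by c**2.
    """
    return c ** 2 * value


# Values quoted in the text for the 30 cell clustering.
base = {"Pi_1": 9.79, "Pi_2": 0.08}

for c in (2, 5):
    scaled = {name: round(scaled_noncentrality(v, c), 2) for name, v in base.items()}
    print(c, scaled)
```

With c = 2 and c = 5 this reproduces the pairs (39.16, 0.32) and (244.75, 2.00) quoted above.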

4.9  Discussion 

When  a  designed  experiment  includes  replicated  points, 
the  adequacy  of  a  fitted  model  can  be  tested  by  comparing 
the  portion  of  the  residual  sum  of  squares  due  to  lack  of 
fit  to  a  second  portion  due  to  pure  error  from  the 
replicates.   The  test  statistic  is  an  F  ratio  of  the  mean 
square  due  to  lack  of  fit  to  the  mean  square  due  to  pure 
error,  and  lack  of  fit  is  inferred  when  the  calculated  value 
of  this  ratio  is  large  (Draper  and  Smith,  1981,  p. 120). 

When replicate points do not exist, lack of fit can be
tested using near neighbor observations with the test
statistic $F = MSE_B/MSE_W$.  This F ratio has been shown to
possess a central F distribution when the fitted model is
adequate, and a doubly noncentral F distribution when the
fitted model suffers from lack of fit.

When the fitted model is adequate, the expected values
of both $MSE_B$ and $MSE_W$ are equal to $\sigma^2$, so that the ratio
$E[MSE_B]/E[MSE_W]$ equals unity.  However, when lack of fit is
present, both $MSE_B$ and $MSE_W$ are biased estimates of $\sigma^2$, and
we compare the magnitudes of the biases of these estimates
(which are functions of the noncentrality parameters and
degrees of freedom of the doubly noncentral F distribution)
in the F test.  The test has an upper tailed rejection
region if the bias corresponding to $MSE_B$ exceeds the bias
corresponding to $MSE_W$.  The rejection region is lower tailed
if the bias corresponding to $MSE_W$ exceeds the bias
corresponding to $MSE_B$.  In other words, the test is upper
tailed if the matrix D (see Eq. 4.13) is positive definite,
and the test is lower tailed if D is negative definite.  If
D is indefinite, then the test may be upper tailed or lower
tailed, or lack of fit may not be testable at all,
depending upon the value of $\beta_2$.
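
The classification of the rejection region by the definiteness of D can be sketched in code. The example below is only an illustration: it assumes D is supplied as a symmetric matrix (the D of Eq. 4.13 is not reproduced here, and the matrices used are hypothetical), and it decides definiteness by attempting a Cholesky factorization, which succeeds only for a positive definite matrix. The function names are illustrative.

```python
import math


def is_positive_definite(D, tol=1e-12):
    """Attempt a Cholesky factorization; it succeeds only if D is positive definite."""
    n = len(D)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = D[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:          # non-positive pivot: not positive definite
                    return False
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True


def tail_of_test(D):
    """Classify the rejection region from the definiteness of a symmetric D."""
    if is_positive_definite(D):
        return "upper tailed"
    if is_positive_definite([[-v for v in row] for row in D]):
        return "lower tailed"
    return "depends on beta_2"


# Hypothetical 2 x 2 matrices, used only to exercise the three cases.
print(tail_of_test([[2.0, 1.0], [1.0, 2.0]]))
print(tail_of_test([[-2.0, 0.0], [0.0, -3.0]]))
print(tail_of_test([[1.0, 0.0], [0.0, -1.0]]))
```

The last category lumps together the indefinite and semidefinite cases, for which the direction of the test cannot be read from D alone.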

In  two  examples  an  iterative  partitioning  clustering 
algorithm  is  used  to  assign  the  data  points  to  a  preselected 
number  of  near  neighbor  cells.   When  the  number  of  cells  is 
low,  the  matrix  D  is  found  to  be  indefinite,  so  that  the  F 
test  is  not  strictly  upper  tailed  or  lower  tailed.   However, 
by  increasing  the  number  of  cells,  it  is  possible  in  both 
examples  to  produce  a  positive  definite  matrix  D,  so  that 
the  test  is  upper  tailed. 

Increasing the number of cells not only produces an
upper tailed test, but also affects the values of the
parameters of the doubly noncentral F distribution.  As the
number of cells is increased (moving from left to right in
Tables 9 and 12) we see that the smallest eigenvalue of $C_1$
increases and that the largest eigenvalue of $C_2$ decreases.
Therefore a lower bound for $\Pi_1$,

$$\frac{\beta_2'\beta_2\,\psi_{min}}{2\sigma^2} \le \Pi_1,$$

increases as the number of cells increases (where $\psi_{min}$
denotes the smallest eigenvalue of $C_1$).  In addition, an
upper bound for $\Pi_2$,

$$\Pi_2 \le \frac{\beta_2'\beta_2\,\rho_{max}}{2\sigma^2},$$

decreases as the number of cells increases (where $\rho_{max}$
denotes the largest eigenvalue of $C_2$).  Finally, as the
number of cells increases (moving from left to right in
Tables 9 and 12), the numerator degrees of freedom, g - p,
increase and the denominator degrees of freedom, N - g - r,
decrease.  Since the parameters of the doubly noncentral F
distribution change as the number of cells changes, the
power of the F test can be affected.  For the stack loss
data example, we see in Table 9 that the power of the upper
tailed test decreases as we move from 9 to 10 to 11 to 12
cells.
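
The eigenvalue bounds used in this discussion follow from the fact that a quadratic form lies between the smallest and largest eigenvalue of its matrix times the squared length of the vector. The sketch below illustrates this for a hypothetical 2 x 2 symmetric matrix C. It assumes that a noncentrality parameter has the form $\beta_2'C\beta_2/2\sigma^2$, which is the form implied by the bounds; the matrix, the vector, the variance and all function names are made up for illustration.

```python
import math


def eigenvalues_2x2(C):
    """Smallest and largest eigenvalue of a symmetric 2 x 2 matrix (closed form)."""
    (a, b), (_, d) = C
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius


def noncentrality(beta, C, sigma2):
    """Quadratic form beta' C beta / (2 sigma^2)."""
    quad = sum(beta[i] * C[i][j] * beta[j] for i in range(2) for j in range(2))
    return quad / (2.0 * sigma2)


C = [[2.0, 0.5], [0.5, 1.0]]      # hypothetical symmetric matrix
beta = [1.0, -2.0]                # hypothetical parameter vector
sigma2 = 0.5                      # hypothetical error variance

smallest, largest = eigenvalues_2x2(C)
beta_sq = sum(b * b for b in beta)
lower = beta_sq * smallest / (2.0 * sigma2)
upper = beta_sq * largest / (2.0 * sigma2)
value = noncentrality(beta, C, sigma2)

print(lower, value, upper)
```

If the smallest eigenvalue rises, the lower bound rises with it, and if the largest eigenvalue falls, the upper bound falls with it.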

An area for future study is a further investigation
of the effect of the number and composition of near neighbor
cells on the power of the F test which makes use of
$F = MSE_B/MSE_W$.  This investigation would involve the effect
of near neighbor cell selections on the parameters $\Pi_1$, $\Pi_2$,
g - p, and N - g - r of the doubly noncentral F distribution.
It would be desirable to develop a method (perhaps an
alternative to the iterative partitioning clustering
algorithm) which could be used to select the number and
composition of cells so as to maximize the power of the F test.


CHAPTER  FIVE 
CONCLUSIONS  AND  RECOMMENDATIONS 

Two  general  methods  for  testing  a  linear  model  fitted 
in  a  mixture  space  for  lack  of  fit  have  been  investigated  in 
this  dissertation.   The  first  method  makes  use  of  response 
values  observed  at  check  points  while  the  second  method 
makes  use  of  response  values  observed  at  design  points  which 
are  near  neighbors  in  the  factor  space. 

In Chapter Two we discussed the work of several authors
(Scheffé (1958), Gorman and Hinman (1962), Kurotori (1966),
and Snee (1971)) for testing lack of fit which centered on
measuring bias inherent in the fitted model when estimating
the response at check points.  Only the method suggested by
Scheffé (1958) was an exact test.  In Chapter Three, a
method for selecting check points that maximizes the power
of Scheffé's F test was devised.  When replicate response
observations were available, so that the experimental error
variance could be estimated by $\hat{\sigma}^2_{ext}$ from the replicates, we
saw that the power of this upper tailed F test was maximized
by selecting check points that maximize (or approximately
maximize) the noncentrality parameter $\lambda_1$ of the noncentral F
distribution.  When the matrix $\Lambda_1$ (where $\lambda_1 = \beta_2'\Lambda_1\beta_2/2\sigma^2$)
was found to be positive semidefinite, it was determined
that only a subset of possible values of the $\beta_2$ parameter
vector could be detected as contributing to lack of fit.

When an estimate of the experimental error variance was
not available from replicates, an extension of Scheffé's F
test for lack of fit which replaced $\hat{\sigma}^2_{ext}$ with MSE (the
residual mean square error from the fitted model) in the
denominator was developed.  We found that to maximize the
power of the test it was necessary to select check points to
maximize (or approximately maximize) the numerator
noncentrality parameter, $\lambda_1$, of the doubly noncentral F
distribution when the test was upper tailed.  When the test
was lower tailed, we sought check point locations that
minimized (or approximately minimized) $\lambda_1$.  A criterion was
developed for determining whether the test was upper tailed
or lower tailed by comparing the expected values of the
numerator and denominator of the F ratio when the fitted
model was inadequate.  Finally, we discovered cases where,
for some values of $\beta_2$, lack of fit could not be tested.

An  alternative  to  the  check  points  method  for  testing 
lack  of  fit  in  a  fitted  model  is  a  procedure  that  involves 
measuring  the  bias  that  is  present  in  estimates  of  the 
response  at  the  design  points  (the  number  of  design  points 
must exceed the number of terms in the model).  When
replicate  observations  are  available,  the  well  known 
procedure  in  which  the  test  statistic  is  a  ratio  of  the  lack 
of  fit  mean  square  to  the  pure  error  mean  square  can  be  used 
to  test  for  lack  of  fit  (see  Draper  and  Smith,  1981, 
p.  120).   When  replicate  observations  are  not  available, 
several techniques which make use of near neighbor
observations in place of replicates for testing lack of fit
have been proposed in the literature (see Green (1971),
Daniel and Wood (1971), and Shillington (1979)).
Additionally,  it  has  been  suggested  by  Draper  and  Smith 
(1981,  p.  42)  that  lack  of  fit  can  be  tested  by  using  near 
neighbor  observations  as  substitutes  for  replicate 
observations  in  the  usual  lack  of  fit,  pure  error  F  ratio. 
However,  the  exact  distributions  of  the  test  statistics 
proposed  by  Daniel  and  Wood  (1971)  and  Draper  and  Smith 
(1981,  p.  42)  have  not  been  defined,  and  Green's  (1971) 
procedure  requires  an  inordinately  large  number  of 
observations.  For these reasons we chose
Shillington's (1979) procedure to study in greater detail in
Chapter Four.

In Chapter Four the distributional properties of
Shillington's test statistic were developed, and a method
based on an iterative partitioning clustering algorithm for
defining groups of near neighbor observations was
proposed.  It was shown that the power of Shillington's test
depends on the parameters of the doubly noncentral F
distribution, and that the manner in which observations are
grouped as near neighbors can alter the values of the
parameters of the doubly noncentral F distribution and thus
affect the power of the test.  We found that increasing the
number of near neighbor cells so that individual cells
become more compact produced an upper tailed F test in the
two examples studied, but that there are many other cases
where the test will not be upper tailed.

Now  that  we  have  briefly  summarized  our  findings  from 
investigating  the  check  point  and  near  neighbor  methods  of 
testing  lack  of  fit  in  a  mixture  model,  a  logical  question 
is,  "Which  of  the  two  methods  is  better?"   It  was  not  our 
original  intent  to  address  this  question  in  this 
dissertation,  but  an  interesting  result  that  has  been 
discovered  in  the  latter  stages  of  our  investigations  is  as 
follows:   Under  certain  circumstances,  the  check  point 
method  for  testing  lack  of  fit  is  equivalent  to  the  usual 
method  which  partitions  the  residual  sum  of  squares  into 
sums  of  squares  due  to  lack  of  fit  and  due  to  pure  error 
(which  was  shown  in  Chapter  Four  to  be  a  special  case  of  the 
near  neighbor  method).   Because  we  have  not  found  a 
derivation  of  the  equality  of  these  methods  in  the 
literature,  we  shall  show  it  here. 

In Chapter Three, check points were used to test lack
of fit in a fitted model of the form $E(Y) = X\beta_1$.  With k
check points, the test statistic was of the form (see Eq.
(3.3))

$$F = \frac{d'V_0^{-1}d/k}{\hat{\sigma}^2_{ext}},$$

where $\hat{\sigma}^2_{ext}$ is an external estimate of $\sigma^2$ which can be
calculated from replicates, if they exist.  The vector d in
the F ratio was defined to be a vector of differences
between observed and predicted response values at the k
check points having the form

$$d = Y^* - X^*(X'X)^{-1}X'Y,$$

where $Y^*$ is the $k \times 1$ vector of observed response values at
the k check points and $X^*$ is the matrix of corresponding
settings of the model terms at the check points.  The matrix
$\sigma^2 V_0$ was defined as the variance-covariance matrix of d,
where $V_0$ has the form

$$V_0 = I_k + X^*(X'X)^{-1}X^{*\prime}.$$

It can be shown (see Appendix 13) that if we define the
vector $Y_A$ as

$$Y_A = \begin{bmatrix} Y \\ Y^* \end{bmatrix}
\quad \begin{matrix} \text{observations at the design points} \\ \text{observations at the check points} \end{matrix}$$

and similarly define the matrix $X_A$ as

$$X_A = \begin{bmatrix} X \\ X^* \end{bmatrix}
\quad \begin{matrix} \text{design point settings} \\ \text{check point settings} \end{matrix}$$

so that the original design points as well as the check
points are all taken at once as design points in regressing
$Y_A$ on $X_A$, then

$$SSE_A = Y_A'[I_{N+k} - X_A(X_A'X_A)^{-1}X_A']Y_A = d'V_0^{-1}d + SSE.    (5.1)$$

Thus, the residual sum of squares, $SSE_A$, from the analysis
of the fitted model when both the original design points and
the check points are used to fit the model is equal to the
sum of the quadratic form, $d'V_0^{-1}d$, used in the numerator of
the check point F test and the residual sum of squares, SSE,
from the analysis of the fitted model using data collected
only from the original design points.

If we perform the usual partitioning of $SSE_A$ into a
lack of fit sum of squares, $SS_{LOF(A)}$, and a pure error sum
of squares due to replicates, $SSE_{pure(A)}$, then from Eq.
(5.1) we can write

$$SS_{LOF(A)} + SSE_{pure(A)} = d'V_0^{-1}d + SSE.    (5.2)$$

Thus from Eq. (5.2), when $SSE_{pure(A)}$ is equal to SSE, then
$SS_{LOF(A)}$ becomes equal to $d'V_0^{-1}d$ so that the check point F
ratio, $F = (d'V_0^{-1}d/k)/\hat{\sigma}^2_{ext}$, and the usual lack of fit F
ratio, $F = MS_{LOF(A)}/MSE_{pure(A)}$, are equivalent.  We now
present an example to illustrate the result in Eq. (5.2).

Let us fit a second degree Scheffé polynomial model to
the following hypothetical response observations collected
at the six points of the {3,2} simplex lattice design:

$$Y = \begin{bmatrix} 2350 \\ 2370 \\ 2450 \\ 2430 \\ 2650 \\ 2670 \\ 2400 \\ 2420 \\ 2750 \\ 2730 \\ 2950 \\ 2970 \end{bmatrix},
\qquad
X = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
.5 & .5 & 0 & .25 & 0 & 0 \\
.5 & .5 & 0 & .25 & 0 & 0 \\
.5 & 0 & .5 & 0 & .25 & 0 \\
.5 & 0 & .5 & 0 & .25 & 0 \\
0 & .5 & .5 & 0 & 0 & .25 \\
0 & .5 & .5 & 0 & 0 & .25
\end{bmatrix}.$$

The fitted model is $\hat{Y} = 2360x_1 + 2440x_2 + 2660x_3 + 40x_1x_2 +
920x_1x_3 + 1640x_2x_3$.  Since each of the six design points is
replicated twice, there are six degrees of freedom available
for estimating the experimental error variance.  Let an
estimate of the error variance be $\hat{\sigma}^2_{ext} = MSE_{pure} =
SSE_{pure}/6 = 1200/6 = 200$; this value will be the
denominator of the check point lack of fit F ratio.

Let us choose the three points (2/3, 1/6, 1/6),
(1/6, 2/3, 1/6), and (1/6, 1/6, 2/3) as check points and
assume that we have observed the following values at these
points:

$$Y^* = \begin{bmatrix} 2690 \\ 2770 \\ 2980 \end{bmatrix},
\qquad
X^* = \begin{bmatrix}
2/3 & 1/6 & 1/6 & 1/9 & 1/9 & 1/36 \\
1/6 & 2/3 & 1/6 & 1/9 & 1/36 & 1/9 \\
1/6 & 1/6 & 2/3 & 1/36 & 1/9 & 1/9
\end{bmatrix}.$$

The quadratic form in the numerator of the check point lack of
fit F ratio is then calculated to be $d'V_0^{-1}d = 24546.7$, so that
$F = (d'V_0^{-1}d/k)/\hat{\sigma}^2_{ext} = (24546.7/3)/200 = 40.91$.

If we use all of the observed response values to fit
the second order Scheffé polynomial, then the model is
$\hat{Y} = 2360.7x_1 + 2437.7x_2 + 2661.7x_3 + 183.3x_1x_2 + 1071.3x_1x_3
+ 1785.3x_2x_3$ and the residual sum of squares is $SSE_A =
25746.7$.  This residual sum of squares can be partitioned
into $SS_{LOF(A)} = 24546.7$ and $SSE_{pure(A)} = 1200$.  The F ratio
for testing lack of fit is calculated to be
$F = MS_{LOF(A)}/MSE_{pure(A)} = [24546.7/3]/[1200/6] = 40.91$,
which is identical to the previously calculated F value.

In the above example we note that $SSE_{pure(A)}$ is equal
to SSE ($SSE = SSE_{pure}$) so that $SS_{LOF(A)}$ is equal
to $d'V_0^{-1}d$.  Since both the check point F ratio and the usual
lack of fit F ratio have produced the same value, F = 40.91,
we conclude that the two methods for testing lack of fit in
the fitted model are equivalent.
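
The arithmetic of this example can be retraced numerically. The sketch below is an illustration rather than the computation used in the dissertation: it rebuilds X, Y, the check point settings and the check point responses from the values given above, fits the model by ordinary least squares with a small Gauss-Jordan solver, and compares the two sides of Eq. (5.1). The helper names (`scheffe_row`, `solve`, `fit_sse`, and so on) are made up for this sketch.

```python
def scheffe_row(x1, x2, x3):
    """Terms of the second degree Scheffe model: x1, x2, x3, x1x2, x1x3, x2x3."""
    return [x1, x2, x3, x1 * x2, x1 * x3, x2 * x3]


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def solve(A, B):
    """Solve A Z = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, A[i])) + list(map(float, B[i])) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


def fit_sse(X, y):
    """Least squares fit; returns (coefficients, residual sum of squares)."""
    Xt = transpose(X)
    beta = solve(matmul(Xt, X), matmul(Xt, [[v] for v in y]))
    fitted = matmul(X, beta)
    sse = sum((yi - fi[0]) ** 2 for yi, fi in zip(y, fitted))
    return [b[0] for b in beta], sse


# Data of the example: each design point of the {3,2} lattice is replicated twice.
design = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (.5, .5, 0), (.5, 0, .5), (0, .5, .5)]
X = [scheffe_row(*p) for p in design for _ in range(2)]
Y = [2350, 2370, 2450, 2430, 2650, 2670, 2400, 2420, 2750, 2730, 2950, 2970]
checks = [(2 / 3, 1 / 6, 1 / 6), (1 / 6, 2 / 3, 1 / 6), (1 / 6, 1 / 6, 2 / 3)]
X_star = [scheffe_row(*p) for p in checks]
Y_star = [2690, 2770, 2980]
k = len(checks)

beta, sse = fit_sse(X, Y)                              # fit on design points only
pred = matmul(X_star, [[b] for b in beta])
d = [[ys - p[0]] for ys, p in zip(Y_star, pred)]       # observed minus predicted
XtX = matmul(transpose(X), X)
H = matmul(X_star, solve(XtX, transpose(X_star)))      # X*(X'X)^{-1}X*'
V0 = [[H[i][j] + (1.0 if i == j else 0.0) for j in range(k)] for i in range(k)]
quad = matmul(transpose(d), solve(V0, d))[0][0]        # d' V0^{-1} d
_, sse_aug = fit_sse(X + X_star, Y + Y_star)           # fit on all 15 observations
F = (quad / k) / (sse / 6)                             # six pure error d.f.

print(round(sse, 1), round(quad, 1), round(sse_aug, 1), round(F, 2))
```

Because the six parameter model passes through the six cell means, SSE from the original fit is entirely pure error, which is the condition under which the two F ratios coincide.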

In  order  to  put  this  dissertation  in  a  better 
perspective,  we  now  make  some  concluding  remarks  on  the  lack 
of  fit  testing  procedures  investigated,  including  possible 
drawbacks,  extensions,  and  recommendations  for  future  work. 

An  aspect  of  our  investigations  that  may  raise  some 
questions  is  that  our  methods  are  dependent  on  the 
specification  of  the  form  of  the  true  model  believed  to  be 
responsible  for  lack  of  fit  in  the  fitted  model.   Requiring 
the  form  of  the  true  model  to  be  specified  was  necessary  in 
order  to  be  able  to  investigate  the  power  of  the  testing 
procedures.  There are situations, however, where a complete
or true model can reasonably be specified.  One example
could be in fitting polynomial models, where the polynomial
of one degree higher than the fitted model could be taken as
the true model.

We  now  mention  two  ways  in  which  our  results  can  be 
applied  to  more  general  situations  than  may  be  readily 
apparent  from  our  previous  discussions.   First,  we  point  out 
that  all  examples  in  Chapters  Three  and  Four  dealt  with 
polynomial models.  This type of model was selected because
of its popularity and wide applicability; however, our
methods can be applied not only to polynomial models but to
any models which are linear in their parameters.  Secondly,
it  was  our  intent  in  this  dissertation  to  discuss  methods 
for  testing  lack  of  fit  in  a  mixture  model,  but  the  methods 
discussed  can  certainly  be  used  not  only  in  mixture  problems 
but  also  in  general  response  surface  problems  in  which  a 
linear  model  is  fitted.   This  generalization  is  illustrated 
for  the  near  neighbor  approach  to  lack  of  fit  testing 
through  the  stack  loss  example  in  Chapter  Four. 

Topics  for  future  research  stemming  from  this 
dissertation  were  listed  in  the  concluding  paragraphs  of 
Chapters  Three  and  Four.   One  area  suggested  in  Chapter 
Three  was  to  investigate  the  effect  of  experimental  design 
on  the  selection  of  check  points  and  on  the  resulting  power 
of the test.  Perhaps a "minimum bias" design could be used
for fitting the model, while lack of fit could be detected
with "high bias" check points, but this is only speculation,
and needs to be investigated.

The fact that the check points method and the standard
method  that  partitions  the  residual  sum  of  squares  into  lack 
of  fit  and  pure  error  portions  were  found,  under  a  certain 
condition,  to  be  equivalent  suggests  that  selecting  check 
points  to  maximize  the  power  of  the  check  point  F  test  may 
in  general  be  equivalent  to  choosing  points  to  augment  the 
original  design.   The  augmented  points  would  be  chosen  to 
maximize  the  power  of  the  F  test  that  partitions  the 
residual  sum  of  squares  into  lack  of  fit  and  pure  error  sums 
of  squares.   An  investigation  of  the  selection  of  optimal 
check  points  versus  the  selection  of  optimal  augmented 
design  points  would  be  of  interest. 

For  the  near  neighbor  test  for  lack  of  fit  it  was 
recommended  in  Chapter  Four  that  other  methods  besides  the 
iterative  partitioning  clustering  algorithm  might  be 
considered  for  selecting  groups  of  near  neighbors.   The 
effect  of  the  number  and  composition  of  the  groups  selected 
on  the  power  of  the  test  through  their  effect  on  the 
parameters  of  the  doubly  noncentral  F  distribution  could 
then  be  investigated. 

In view of the equivalence of the check point method
and the method that partitions the residual sum of squares
when replicates exist (see Eq. (5.2)), it would be of
interest to investigate whether there is also some
equivalence between Shillington's near neighbor F ratio and
the check point F ratio, $F = (d'V_0^{-1}d/k)/MSE$, to be used when
an external estimate of $\sigma^2$ is not available.  If the methods
are not equivalent, perhaps one could be shown to be
preferable to the other as judged by comparing the power of
the two procedures in testing for lack of fit.

Finally,  the  focus  of  this  dissertation  has  been  on 
testing  lack  of  fit  in  linear  models  so  that  another  area 
for  future  investigation  can  be  the  problem  of  testing  lack 
of  fit  in  models  which  are  nonlinear  in  their  parameters. 


APPENDIX 1
INFLUENCE OF $\lambda_1$ ON $P\{F''_{\nu_1,\nu_2;\lambda_1,\lambda_2} > F_{\alpha;\nu_1,\nu_2}\}$

In this appendix we show that $P\{F''_{\nu_1,\nu_2;\lambda_1,\lambda_2} > F_{\alpha;\nu_1,\nu_2}\}$
is an increasing function of $\lambda_1$.

Let $X_1, \ldots, X_{\nu_1}, Y_1, \ldots, Y_{\nu_2}$ be independent N(0,1).
Then

$$F = \frac{\nu_2}{\nu_1}\,
\frac{(X_1 + \lambda_1^{1/2})^2 + \sum_{i=2}^{\nu_1} X_i^2}
     {(Y_1 + \lambda_2^{1/2})^2 + \sum_{i=2}^{\nu_2} Y_i^2}$$

is distributed as $F''_{\nu_1,\nu_2;\lambda_1,\lambda_2}$, where $\nu_1$ and $\nu_2$ are the
respective numerator and denominator degrees of freedom and
$\lambda_1$ and $\lambda_2$ are the respective numerator and denominator
noncentrality parameters (Scheffé, 1959, p. 412-413).
Fixing the values of $\nu_1$, $\nu_2$ and $\lambda_2$, we wish to show that
$P\{F''_{\nu_1,\nu_2;\lambda_1,\lambda_2} > F_{\alpha;\nu_1,\nu_2}\}$ is a strictly increasing function
of $\lambda_1$, where $F_{\alpha;\nu_1,\nu_2}$ represents the upper $100\alpha$ percentage
point of the central F distribution with $\nu_1$ and $\nu_2$ degrees
of freedom.  Let

$$f(\lambda_1^{1/2}) = P\left\{ \frac{\nu_2}{\nu_1}\,
\frac{(X_1 + \lambda_1^{1/2})^2 + \sum_{i=2}^{\nu_1} X_i^2}
     {(Y_1 + \lambda_2^{1/2})^2 + \sum_{i=2}^{\nu_2} Y_i^2}
> F_{\alpha;\nu_1,\nu_2} \right\},$$

then $f(\lambda_1^{1/2})$ may be rewritten as

$$f(\lambda_1^{1/2}) = P\{(X_1 + \lambda_1^{1/2})^2 > U\} = 1 - P\{(X_1 + \lambda_1^{1/2})^2 \le U\},    (A1.1)$$

where

$$U = \frac{\nu_1}{\nu_2}\left[(Y_1 + \lambda_2^{1/2})^2 + \sum_{i=2}^{\nu_2} Y_i^2\right]F_{\alpha;\nu_1,\nu_2} - \sum_{i=2}^{\nu_1} X_i^2.$$

Note that the random variable U is independent of $X_1$.

If $\lambda_{11}^{1/2}$ and $\lambda_{12}^{1/2}$ denote any two values of $\lambda_1^{1/2}$ such
that $\lambda_{11}^{1/2} < \lambda_{12}^{1/2}$, then we shall prove that for $f(\lambda_1^{1/2})$
defined as in (A1.1), $f(\lambda_{11}^{1/2}) < f(\lambda_{12}^{1/2})$.  Now
$f(\lambda_1^{1/2}) = 1 - \int_0^\infty g_{\lambda_1}(u)p(u)\,du$, where p(u) is the p.d.f. of U,
and for any positive number, u', $g_{\lambda_1}(u')$ denotes the
conditional probability that $(X_1 + \lambda_1^{1/2})^2 \le u'$, given
U = u'.  However, this conditional probability must be the
same as the unconditional probability, since $X_1$ and U are
statistically independent.

Thus $g_{\lambda_1}(u')$ is the probability that the random
variable $X_1$ falls in an interval of half length $u'^{1/2}$
centered at $-\lambda_1^{1/2}$.  Since $X_1 \sim N(0,1)$, this is a decreasing
function of $\lambda_1^{1/2}$.  Therefore $g_{\lambda_{11}}(u') - g_{\lambda_{12}}(u') > 0$
for all u' > 0.  Hence,

$$f(\lambda_{11}^{1/2}) - f(\lambda_{12}^{1/2}) = \int_0^\infty [-g_{\lambda_{11}}(u) + g_{\lambda_{12}}(u)]\,p(u)\,du < 0.$$

Thus $P\{F''_{\nu_1,\nu_2;\lambda_1,\lambda_2} > F_{\alpha;\nu_1,\nu_2}\}$ is a strictly increasing
function of $\lambda_1$.

We note that this proof is a modification of the proof
that $P\{F''_{\nu_1,\nu_2;\lambda_1,\lambda_2} > F_{\alpha;\nu_1,\nu_2}\}$ is decreasing in $\lambda_2$
(Scheffé, 1959, p. 136).
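
The monotonicity shown in this appendix can be illustrated by simulation. The sketch below is a Monte Carlo illustration and not the Johnson and Kotz approximation used in Chapter Four: it draws the ratio exactly as constructed above (a shifted squared normal plus a central chi-square in both the numerator and the denominator) and estimates the tail probability. The function name, the number of simulations and the seed are arbitrary choices; the degrees of freedom and the critical value 5.81 are those of the glass leaching example.

```python
import math
import random


def doubly_noncentral_f_tail(nu1, nu2, lam1, lam2, crit, n_sim=40000, seed=1):
    """Monte Carlo estimate of P{F'' > crit} using the Appendix 1 representation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        num = (rng.gauss(0.0, 1.0) + math.sqrt(lam1)) ** 2
        den = (rng.gauss(0.0, 1.0) + math.sqrt(lam2)) ** 2
        if nu1 > 1:
            # Sum of nu1 - 1 squared N(0,1) variables: chi-square = Gamma(df/2, scale 2).
            num += rng.gammavariate((nu1 - 1) / 2.0, 2.0)
        if nu2 > 1:
            den += rng.gammavariate((nu2 - 1) / 2.0, 2.0)
        if (nu2 / nu1) * num / den > crit:
            hits += 1
    return hits / n_sim


# Degrees of freedom and critical value from the 30 cell glass leaching example.
p_null = doubly_noncentral_f_tail(19, 4, 0.0, 0.0, 5.81)
p_mid = doubly_noncentral_f_tail(19, 4, 10.0, 0.0, 5.81)
p_high = doubly_noncentral_f_tail(19, 4, 40.0, 0.0, 5.81)
print(p_null, p_mid, p_high)
```

With both noncentrality parameters set to zero the estimate should sit near the nominal level .05, and it should rise as the numerator noncentrality parameter is increased with everything else held fixed.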


APPENDIX  2 
A  CONTROLLED  RANDOM  SEARCH  PROCEDURE 
FOR  GLOBAL  OPTIMIZATION 

W.  L.  Price  (1977)  describes  a  conceptually  simple 
random  search  procedure,  called  "a  controlled  random  search 
procedure  for  global  optimization,"  which  is  effective  in 
searching  for  global  minima  of  a  function  of  n  variables, 
with  or  without  constraints.   The  procedure  does  not  require 
the function to be differentiable or the variables to be
continuous. 

An initial search domain, V, is defined by specifying
upper and lower bounds for each of the n variables, and a
predetermined number, N, of trial points are chosen at
random over V, consistent with any constraints.  The
function is evaluated at each of the N trial points and the
position as well as the value of the function at each point
are stored in an array, A.  At each iteration a new trial
point, P, is selected randomly from a set of possible trial
points whose positions are related to the configuration of
the N points currently in storage.  If P satisfies the
constraints, the function is evaluated at P and the function
value, $f_P$, is compared with $f_M$, which is the greatest
function value for the N points already in storage.
If $f_P < f_M$ then M, the point in storage corresponding to $f_M$,
is replaced, in the array A, by P.  If P fails to satisfy
the constraints or if $f_P > f_M$ then the trial is discarded
and a new point is chosen from the potential trial set.

As the algorithm proceeds, the set of N points in
storage tends to cluster around minima.  As Price states,
"the probability that the points ultimately converge onto the
global minimum (minima) depends on the value of N, the
complexity of the function, the nature of the constraints
and the way in which the set of potential trial points is
chosen."

Price notes that since the procedure is intended to
find global minima, thoroughness of search is more important
than speed of convergence, but if the procedure is to be
more efficient than pure random search the probability of
success ($f_P < f_M$) at each iteration must be sufficiently
high.  His procedure reaches a compromise between the
requirements of search and convergence by defining the set
of potential trial points in terms of the configuration of
the N points already in storage.  At each iteration n + 1
distinct points, $R_1, R_2, \ldots, R_{n+1}$, are chosen at random
from the N (N > n) currently in storage and these constitute
a simplex of points in n-space.  The point $R_{n+1}$ is
arbitrarily chosen as the vertex of the simplex, and the
next trial point, P, is taken as the image of the vertex
with respect to the centroid, G, of the remaining n
points.  Thus $P = 2G - R_{n+1}$.  He notes that it is possible
to speed up convergence by selecting the vertex as the
point $R_i$, i = 1, 2, ..., n + 1, which has the largest
function value of the points $R_1, R_2, \ldots, R_{n+1}$, but this
would be detrimental to the thoroughness of the search.

The version of Price's procedure used in the work in
this dissertation was programmed in the FORTRAN language by
Michael Conlon of the Center for Instructional and Research
Computing Activities, the University of Florida,
Gainesville, Florida.  This version of Price's procedure
selects new trial points using the suggested criterion
$P = 2G - R_{n+1}$.  The algorithm continues until an iteration
limit is reached or a desired tolerance between the minimum
and maximum function values in storage is achieved.

In our particular application, if $p_2 = 1$ so that $\Lambda_1$ is
a scalar, we wish to maximize

\[
\Lambda_1 = (X_2^{*} - X^{*}A)'V_0^{-1}(X_2^{*} - X^{*}A),
\]

with respect to k check points, in order to maximize the
power of an upper tailed test. For locating check points
that maximize the power of a lower tailed test it is
necessary to minimize $\Lambda_1$. If $p_2 > 1$ so that $\Lambda_1$ is not a
scalar, but is a $p_2 \times p_2$ matrix, then it will be necessary to
maximize or minimize certain eigenvalues of $\Lambda_1$.

All  of  these  optimization  problems  can  be  handled  by 
Price's  procedure.   Since  the  procedure  finds  minima,  then 
to  find  maxima,  we  simply  minimize  the  negative  of  the 
function  under  consideration.   The  restriction  that  the 
check  points  must  be  located  within  the  experimental  simplex 
(or  a  subregion  of  the  simplex)  is  taken  care  of  by
specifying  constraints  in  the  program. 

To give a specific example, suppose we fit a second
order canonical polynomial model in a three component
mixture space, using a simplex centroid design. If we
assume the true model is special cubic in the three
components, then $p_2 = 1$, and

\[
\Lambda_1 = (X_2^{*} - X^{*}A)'V_0^{-1}(X_2^{*} - X^{*}A)
\]

is a scalar quantity. In order to locate a single check
point that maximizes the power of an upper tailed test for
lack of fit, we select the check point that maximizes $\Lambda_1$.
Since the experimental region we wish to search is the
entire two dimensional simplex, we define the check point
as $x^{*\prime} = (x_1, x_2, x_3)$, and in our program impose the
constraints:

\[
0 \le x_1 \le 1,
\]

and

\[
0 \le x_2 \le 1.
\]

We then define $x_3$ as $x_3 = 1 - x_1 - x_2$, while requiring
that $0 \le x_3 \le 1$. Price's random search procedure is used to
search the two-dimensional simplex for the point $(x_1, x_2)$
that maximizes $\Lambda_1$.
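
As a rough illustration of the objective function in this example, the sketch below evaluates $\Lambda_1$ at a single check point for the seven-point simplex centroid design. It assumes that A is the alias matrix $(X'X)^{-1}X'X_2$ and that $V_0 = I + X^{*}(X'X)^{-1}X^{*\prime}$; the function and variable names are hypothetical and are not taken from the program used in this work.

```python
import numpy as np


def quadratic_terms(x):
    """Terms of the second-order canonical polynomial in three components."""
    x1, x2, x3 = x
    return np.array([x1, x2, x3, x1 * x2, x1 * x3, x2 * x3])


def special_cubic_term(x):
    """Extra term that the special cubic model adds to the fitted model."""
    x1, x2, x3 = x
    return np.array([x1 * x2 * x3])


# Simplex centroid design: vertices, edge midpoints, overall centroid.
DESIGN = np.array([
    [1, 0, 0], [0, 1, 0], [0, 0, 1],
    [0.5, 0.5, 0], [0.5, 0, 0.5], [0, 0.5, 0.5],
    [1 / 3, 1 / 3, 1 / 3],
])
X = np.array([quadratic_terms(r) for r in DESIGN])        # 7 x 6
X2 = np.array([special_cubic_term(r) for r in DESIGN])    # 7 x 1
XTX_INV = np.linalg.inv(X.T @ X)
ALIAS = XTX_INV @ X.T @ X2   # assumed form A = (X'X)^{-1} X'X_2


def lambda_1(x1, x2):
    """Lambda_1 for a single check point (x1, x2, 1 - x1 - x2)."""
    x3 = 1.0 - x1 - x2
    if min(x1, x2, x3) < 0.0 or max(x1, x2, x3) > 1.0:
        raise ValueError("check point lies outside the simplex")
    point = (x1, x2, x3)
    xs = quadratic_terms(point).reshape(1, -1)        # X*   (1 x 6)
    x2s = special_cubic_term(point).reshape(1, -1)    # X_2* (1 x 1)
    v0 = np.eye(1) + xs @ XTX_INV @ xs.T              # assumed V_0
    resid = x2s - xs @ ALIAS
    return float((resid.T @ np.linalg.solve(v0, resid))[0, 0])
```

A search routine would then minimize the negative of `lambda_1` over $(x_1, x_2)$, rejecting trial points for which the function raises an error.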
Price  suggests  the  use  of  N  =  50  storage  points  for
such  a  two-dimensional  search,  and  we  have  generally  found 
this  to  be  adequate.   For  k  >  1  check  points  to  be  located 
simultaneously  in  a  three  component  system,  the  problem 
becomes  one  of  searching  in  2k  dimensions.   For  the 
applications  considered,  N  =  50k  appears  to  be  adequate. 

The only real problem encountered has been one of
economics, in that the procedure becomes costly in terms of
computer time for those situations where the optimal value
of the function is assumed by all points in a region. In
these cases the algorithm searches in vain for points that
will improve upon the functional values already in storage,
which all lie in this optimum region. However, in other
applications, the procedure converged quickly to an optimum
(those that converged did so in 10,000 iterations or less,
at a small cost in computational time).


APPENDIX 3
STATISTICAL INDEPENDENCE OF $d'V_0^{-1}d/\sigma^2$ AND $SSE/\sigma^2$

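Let us write $d'V_0^{-1}d$ as

\[
\begin{aligned}
d'V_0^{-1}d &= (Y^{*} - \hat{Y}^{*})'V_0^{-1}(Y^{*} - \hat{Y}^{*}) \\
&= Y^{*\prime}V_0^{-1}Y^{*} - Y^{*\prime}V_0^{-1}\hat{Y}^{*} - \hat{Y}^{*\prime}V_0^{-1}Y^{*} + \hat{Y}^{*\prime}V_0^{-1}\hat{Y}^{*} \\
&= Y^{*\prime}V_0^{-1}Y^{*} - 2\,Y^{*\prime}V_0^{-1}\hat{Y}^{*} + \hat{Y}^{*\prime}V_0^{-1}\hat{Y}^{*}.
\end{aligned}
\]

Now let us write SSE as

\[
SSE = Y'(I_N - X(X'X)^{-1}X')Y.
\]

Since $Y$ and $Y^{*}$ are independent, SSE is independent of
$Y^{*\prime}V_0^{-1}Y^{*}$. Rewriting $Y^{*\prime}V_0^{-1}\hat{Y}^{*}$ as

\[
Y^{*\prime}V_0^{-1}\hat{Y}^{*} = Y^{*\prime}V_0^{-1}X^{*}b_1,
\]

where $b_1$ is the least squares estimator of $\beta_1$, we have

\[
Y^{*\prime}V_0^{-1}\hat{Y}^{*} = Y^{*\prime}V_0^{-1}X^{*}(X'X)^{-1}X'Y.
\]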
We now show that the second portion of $d'V_0^{-1}d$ is independent
of SSE if and only if

\[
[V_0^{-1}X^{*}(X'X)^{-1}X'][I_N - X(X'X)^{-1}X'] = 0.
\]

Define

\[
\begin{aligned}
&\mathrm{cov}[Y^{*\prime}V_0^{-1}X^{*}(X'X)^{-1}X'Y,\; Y'(I_N - X(X'X)^{-1}X')Y] \\
&\quad = E[Y^{*\prime}V_0^{-1}X^{*}(X'X)^{-1}X'YY'(I_N - X(X'X)^{-1}X')Y] \\
&\qquad - E[Y^{*\prime}V_0^{-1}X^{*}(X'X)^{-1}X'Y]\,E[Y'(I_N - X(X'X)^{-1}X')Y] \\
&\quad = E(Y^{*\prime})\,E[V_0^{-1}X^{*}(X'X)^{-1}X'YY'(I_N - X(X'X)^{-1}X')Y] \\
&\qquad - E(Y^{*\prime})\,E[V_0^{-1}X^{*}(X'X)^{-1}X'Y]\,E[Y'(I_N - X(X'X)^{-1}X')Y] \\
&\quad = E(Y^{*\prime})\,[\mathrm{cov}(V_0^{-1}X^{*}(X'X)^{-1}X'Y,\; Y'(I_N - X(X'X)^{-1}X')Y)] \\
&\quad = 0,
\end{aligned}
\]

if $V_0^{-1}X^{*}(X'X)^{-1}X'Y$ is independent of $Y'(I_N - X(X'X)^{-1}X')Y$.
This occurs if and only if

\[
[V_0^{-1}X^{*}(X'X)^{-1}X'][I_N - X(X'X)^{-1}X'] = 0,
\]

see Searle (1971), p. 59, Theorem 3. Now,

\[
\begin{aligned}
[V_0^{-1}X^{*}(X'X)^{-1}X'][I_N - X(X'X)^{-1}X']
&= V_0^{-1}X^{*}(X'X)^{-1}X' - V_0^{-1}X^{*}(X'X)^{-1}X'X(X'X)^{-1}X' \\
&= 0.
\end{aligned}
\]
Therefore SSE is independent of the second portion of
$d'V_0^{-1}d$. Now we must show that SSE is independent of the
third portion of $d'V_0^{-1}d$. Write $\hat{Y}^{*\prime}V_0^{-1}\hat{Y}^{*}$ as

\[
\begin{aligned}
\hat{Y}^{*\prime}V_0^{-1}\hat{Y}^{*} &= (X^{*}b_1)'V_0^{-1}X^{*}b_1 \\
&= Y'X(X'X)^{-1}X^{*\prime}V_0^{-1}X^{*}(X'X)^{-1}X'Y.
\end{aligned}
\]

Then SSE is independent of the third portion of $d'V_0^{-1}d$ if
and only if

\[
[X(X'X)^{-1}X^{*\prime}V_0^{-1}X^{*}(X'X)^{-1}X'][I_N - X(X'X)^{-1}X'] = 0,
\]

see Searle (1971), p. 59, Theorem 4. Continuing then,

\[
\begin{aligned}
&[X(X'X)^{-1}X^{*\prime}V_0^{-1}X^{*}(X'X)^{-1}X'][I_N - X(X'X)^{-1}X'] \\
&\quad = X(X'X)^{-1}X^{*\prime}V_0^{-1}X^{*}(X'X)^{-1}X' - X(X'X)^{-1}X^{*\prime}V_0^{-1}X^{*}(X'X)^{-1}X'X(X'X)^{-1}X' \\
&\quad = X(X'X)^{-1}X^{*\prime}V_0^{-1}X^{*}(X'X)^{-1}X' - X(X'X)^{-1}X^{*\prime}V_0^{-1}X^{*}(X'X)^{-1}X' \\
&\quad = 0.
\end{aligned}
\]

Therefore SSE is independent of the third portion of $d'V_0^{-1}d$.

Finally, since SSE is independent of each of the three
portions of $d'V_0^{-1}d$, we can conclude that SSE is independent
of $d'V_0^{-1}d$ and therefore $SSE/\sigma^2$ is independent of $d'V_0^{-1}d/\sigma^2$.
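
The two matrix conditions used in this appendix can be checked numerically. The sketch below is only an illustration on randomly generated matrices; the dimensions, the seed, and the function name are arbitrary assumptions, and $V_0$ is taken to be $I + X^{*}(X'X)^{-1}X^{*\prime}$.

```python
import numpy as np


def independence_products(seed=0, N=12, p=3, k=2):
    """Return the two matrix products that the appendix shows to be zero."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(N, p))        # design matrix of the fitted model
    Xs = rng.normal(size=(k, p))       # X*, model terms at k check points
    XtX_inv = np.linalg.inv(X.T @ X)
    H = X @ XtX_inv @ X.T              # X(X'X)^{-1}X'
    V0_inv = np.linalg.inv(np.eye(k) + Xs @ XtX_inv @ Xs.T)
    residual_maker = np.eye(N) - H
    # Condition for the second portion of d'V0^{-1}d.
    second = V0_inv @ Xs @ XtX_inv @ X.T @ residual_maker
    # Condition for the third portion of d'V0^{-1}d.
    third = X @ XtX_inv @ Xs.T @ V0_inv @ Xs @ XtX_inv @ X.T @ residual_maker
    return second, third
```

Both products vanish (up to rounding error) because $X'[I_N - X(X'X)^{-1}X'] = 0$.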


APPENDIX  4 
THEOREM  3.1 


Theorem 3.1


Let A and B be $k \times k$ matrices. If (A - B) is positive
definite and B is positive semi-definite, then A is positive
definite.

Proof

We assume that (A - B) is positive definite. Then
$z'(A - B)z > 0$, for all $z \ne 0$. Thus $z'Az - z'Bz > 0$,
for all $z \ne 0$, so that $z'Az > z'Bz \ge 0$, for all $z \ne 0$,
since B is positive semi-definite. Therefore,
$z'Az > 0$, all $z \ne 0$.

Now if $z'Az = 0$, then $z = 0$, for if $z \ne 0$,
then $z'(A - B)z > 0$ implies $z'Bz < 0$. But this is a
contradiction since by assumption $z'Bz \ge 0$.
Therefore z must be 0 and A must be positive definite.
APPENDIX 5
THEOREM 3.2


Theorem 3.2


Let $\Lambda_1$ and $\Lambda_2$ be $p_2 \times p_2$ positive semi-definite
matrices. Let $\beta_2$ be a $p_2$-dimensional vector and define $\lambda_1$
and $\lambda_2$ as

\[
\lambda_1 = \beta_2'\Lambda_1\beta_2/2\sigma^2, \quad \text{and}
\]

\[
\lambda_2 = \beta_2'\Lambda_2\beta_2/2\sigma^2,
\]

where $\sigma^2 > 0$. Let k > 0, N > 0, p > 0, and N > p.

(a) If $[\Lambda_1/k - \Lambda_2/(N - p)]$ is positive definite then
$[\lambda_1/k - \lambda_2/(N - p)] = 0$ if and only if $\lambda_1 = \lambda_2 = 0$.

(b) If $[\Lambda_1/k - \Lambda_2/(N - p)]$ is negative definite then
$[\lambda_1/k - \lambda_2/(N - p)] = 0$ if and only if $\lambda_1 = \lambda_2 = 0$.

Proof of part (a).

Necessity. Let $[\Lambda_1/k - \Lambda_2/(N - p)]$ be positive
definite and suppose that $[\lambda_1/k - \lambda_2/(N - p)] = 0$. We show
that $\lambda_1 = \lambda_2 = 0$. The matrix $\Lambda_1/k - \Lambda_2/(N - p)$ being
positive definite implies $\beta_2'[\Lambda_1/k - \Lambda_2/(N - p)]\beta_2 = 0$ iff
$\beta_2 = 0$, that is,

\[
\beta_2'\Lambda_1\beta_2/k - \beta_2'\Lambda_2\beta_2/(N - p) = 0 \quad \text{iff} \quad \beta_2 = 0.
\]

Dividing through by $2\sigma^2$,

\[
\beta_2'\Lambda_1\beta_2/2k\sigma^2 - \beta_2'\Lambda_2\beta_2/2(N - p)\sigma^2 = 0 \quad \text{iff} \quad \beta_2 = 0.
\]

Hence

\[
\lambda_1/k - \lambda_2/(N - p) = 0 \quad \text{iff} \quad \beta_2 = 0.
\]

It follows that if $\lambda_1/k - \lambda_2/(N - p) = 0$, then $\lambda_1 = \lambda_2 = 0$.

Sufficiency. Obviously, if $\lambda_1 = \lambda_2 = 0$, then

\[
\lambda_1/k - \lambda_2/(N - p) = 0 - 0 = 0.
\]

Proof of part (b). This follows from part (a), since in
this case $\Lambda_2/(N - p) - \Lambda_1/k$ is positive definite.


APPENDIX  6 
AN  APPROXIMATION  TO  THE  DOUBLY  NONCENTRAL  F  DISTRIBUTION 

Johnson and Kotz (1970, p. 197) indicate the following
approximation for $P\{F''_{\nu_1,\nu_2;\lambda_1,\lambda_2} < F_{\alpha;\nu_1,\nu_2}\}$, where $\nu_1$ and $\nu_2$
are the numerator and denominator degrees of freedom,
respectively, and $\lambda_1$ and $\lambda_2$ are the numerator and
denominator noncentrality parameters, respectively:

\[
\begin{aligned}
P\{F''_{\nu_1,\nu_2;\lambda_1,\lambda_2} < F_{\alpha;\nu_1,\nu_2}\}
&\approx P\{cF_{\nu,\nu'} < F_{\alpha;\nu_1,\nu_2}\} \\
&= P\{F_{\nu,\nu'} < (1/c)F_{\alpha;\nu_1,\nu_2}\},
\end{aligned}
\]

where $F_{\alpha;\nu_1,\nu_2}$ is the upper 100α percentage point of the
central F distribution with $\nu_1$ and $\nu_2$ degrees of freedom, and
where $c = [1 + \lambda_1/\nu_1]/[1 + \lambda_2/\nu_2]$, $\nu = [\nu_1 + \lambda_1]^2/[\nu_1 + 2\lambda_1]$,
$\nu' = [\nu_2 + \lambda_2]^2/[\nu_2 + 2\lambda_2]$, and $F_{\nu,\nu'}$ is a central F random
variable with $\nu$ and $\nu'$ degrees of freedom.
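
The constants in this approximation are simple to compute. The short sketch below evaluates c, $\nu$ and $\nu'$ exactly as they are defined above; the function name and argument names are hypothetical, and no probability is computed here.

```python
def doubly_noncentral_f_constants(nu1, nu2, lam1, lam2):
    """Return (c, nu, nu_prime) for the central-F approximation above."""
    c = (1.0 + lam1 / nu1) / (1.0 + lam2 / nu2)
    nu = (nu1 + lam1) ** 2 / (nu1 + 2.0 * lam1)
    nu_prime = (nu2 + lam2) ** 2 / (nu2 + 2.0 * lam2)
    return c, nu, nu_prime
```

The approximate probability is then the central F distribution function with $\nu$ and $\nu'$ degrees of freedom evaluated at $(1/c)F_{\alpha;\nu_1,\nu_2}$, which can be obtained from any routine for the central F distribution that accepts fractional degrees of freedom. When both noncentrality parameters are zero the constants reduce to c = 1, $\nu = \nu_1$ and $\nu' = \nu_2$, so the approximation is exact in the central case.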
APPENDIX 7

EQUIVALENCE OF $SSE_B$ AND $SS_{LOF}$ WHEN
REPLICATES REPLACE NEAR NEIGHBOR OBSERVATIONS

In this appendix we show that $SSE_B = SS_{LOF}$ when
response observations are partitioned into g groups of true
replicates rather than g groups of near neighbor
observations.

From Chapter Two, Section 2.2, if each cell consists
entirely of true replicates, then the sum of squares due to
lack of fit can be expressed as

\[
SS_{LOF} = SSE - SSE_{pure},
\]

where SSE is the residual sum of squares from a least
squares regression of Y on X and where $SSE_{pure}$ is the sum of
squares due to pure error, calculated from replicates.
Since $SSE_{pure} = Y'\Sigma_0Y$, where $\Sigma_0$ is defined as in Section
4.2, we have

\[
\begin{aligned}
SS_{LOF} &= Y'(I_N - X(X'X)^{-1}X')Y - Y'\Sigma_0Y \\
&= Y'(I_N - \Sigma_0)Y - Y'X(X'X)^{-1}X'Y. \qquad \text{(A7.1)}
\end{aligned}
\]

We wish to show that when each cell is composed entirely of
true replicates, $SSE_B$ is equal to the expression in (A7.1).
Recalling from Section 4.3.1 that $\bar{Y} = MY$, where
$M = \mathrm{diag}[(1/n_1)1', \ldots, (1/n_g)1']$, we write

\[
\begin{aligned}
SSE_B &= \bar{Y}'[G_0^{-1} - G_0^{-1}X_c(X_c'G_0^{-1}X_c)^{-1}X_c'G_0^{-1}]\bar{Y} \\
&= Y'M'[G_0^{-1} - G_0^{-1}X_c(X_c'G_0^{-1}X_c)^{-1}X_c'G_0^{-1}]MY,
\end{aligned}
\]

where from Section 4.2, $G_0 = \mathrm{diag}[1/n_1, 1/n_2, \ldots, 1/n_g]$.
Recognizing that $G_0 = MM'$ and $X_c = MX$, we have

\[
SSE_B = Y'[M'(MM')^{-1}M - M'(MM')^{-1}MX\{X'M'(MM')^{-1}MX\}^{-1}X'M'(MM')^{-1}M]Y.
\]

Since $M'(MM')^{-1}M = I_N - \Sigma_0$, we have

\[
SSE_B = Y'(I_N - \Sigma_0)Y - Y'(I_N - \Sigma_0)X[X'(I_N - \Sigma_0)X]^{-1}X'(I_N - \Sigma_0)Y,
\]

and since $\Sigma_0X = 0$ when all cells are composed entirely of
true replicates, we have

\[
\begin{aligned}
SSE_B &= Y'(I_N - \Sigma_0)Y - Y'X(X'X)^{-1}X'Y \\
&= SS_{LOF},
\end{aligned}
\]

from (A7.1). Therefore, $SSE_B$ is equal to the usual $SS_{LOF}$
when cells are composed entirely of true replicates.
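
The equality can be confirmed on a small artificial data set. The sketch below is illustrative only: the cell sizes, the straight-line model, the seed, and the function name are assumptions chosen for the example, and the matrices M, $G_0$, $X_c$ and $\Sigma_0$ are built directly from the definitions quoted in this appendix.

```python
import numpy as np


def lack_of_fit_two_ways(seed=1):
    """Return (SSE_B, SSE - SSE_pure) for data made of true replicates."""
    rng = np.random.default_rng(seed)
    cell_sizes = [3, 2, 4, 2, 3]                 # n_1, ..., n_g
    g, N = len(cell_sizes), sum(cell_sizes)
    levels = np.linspace(0.0, 1.0, g)
    # Rows of X are identical within a cell (true replicates).
    X = np.repeat(np.column_stack([np.ones(g), levels]), cell_sizes, axis=0)
    Y = rng.normal(size=N)

    # M = diag[(1/n_1)1', ..., (1/n_g)1'] maps observations to cell means.
    M = np.zeros((g, N))
    start = 0
    for i, n_i in enumerate(cell_sizes):
        M[i, start:start + n_i] = 1.0 / n_i
        start += n_i

    G0_inv = np.linalg.inv(M @ M.T)              # G_0 = MM'
    Y_bar, Xc = M @ Y, M @ X
    middle = np.linalg.inv(Xc.T @ G0_inv @ Xc)
    sse_b = Y_bar @ (G0_inv - G0_inv @ Xc @ middle @ Xc.T @ G0_inv) @ Y_bar

    identity = np.eye(N)
    sse = Y @ (identity - X @ np.linalg.inv(X.T @ X) @ X.T) @ Y
    sigma0 = identity - M.T @ G0_inv @ M         # I_N - M'(MM')^{-1}M
    sse_pure = Y @ sigma0 @ Y
    return sse_b, sse - sse_pure
```

The two returned values agree to rounding error, in line with (A7.1).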


APPENDIX  8 
LEMMA  4.1 


Lemma 4.1


Let $Y \sim (X\beta, \sigma^2G)$, G singular. Define $T = G + XX'$.

Define $T^{-}$ such that $TT^{-}T = T$.

1. (i) $TT^{-}X = X$
(ii) $X'T^{-}T = X'$

2. $\mathrm{rank}(X'T^{-}X) = \mathrm{rank}(X)$

3. (i) $X(X'T^{-}X)^{-}(X'T^{-}X) = X$
(ii) $(X'T^{-}X)(X'T^{-}X)^{-}X' = X'$

4. Y is in the column space of T ($Y \in C(T)$), with
probability one, by which we mean that there exists a
vector a such that letting $Y = (y_1, y_2, \ldots, y_N)'$
and $Ta = (x_1, x_2, \ldots, x_N)'$, then
$P\{|y_i - x_i| > \epsilon\} = 0$, for all $\epsilon > 0$, i = 1, 2, ..., N.
Proof

1. (i)
\[
\begin{aligned}
T &= XX' + G \\
&= XX' + VV', \quad \text{where } G = VV' \\
&= CC', \quad \text{where } C = [X:V].
\end{aligned}
\]
Now, $CC'(CC')^{-}C = C$, from Pringle and Rayner (1971, p.
26), and therefore

\[
TT^{-}[X:V] = [X:V]
\]

from which it follows that $TT^{-}X = X$ (and $TT^{-}V = V$).
(ii) The proof of (ii) follows directly from (i) by
taking the transpose.

2. The proof of part 2 is given in Rao, 1973, p. 77, #30.

3. The proof of part 3 is given in Rao and Mitra, 1971,
p. 22, Lemma 2.2.6(c).

4. By definition, $Y \sim (X\beta, \sigma^2G)$ so that an equivalent
representation for Y is $Y = X\beta + \epsilon$, where $\epsilon \sim (0, \sigma^2G)$.
We wish to show that the random vector Y is in the
column space of T, with probability one. It is
sufficient to show that $TT^{-}Y = Y$, w.p.1 (see Pringle
and Rayner, 1971, p. 9). Rewriting $TT^{-}Y$ we have

\[
\begin{aligned}
TT^{-}Y &= TT^{-}(X\beta + \epsilon) \\
&= TT^{-}X\beta + TT^{-}\epsilon.
\end{aligned}
\]

By part 1 of Lemma 4.1, $TT^{-}X = X$, and therefore
$X\beta \in C(T)$. The proof is complete if we show $TT^{-}\epsilon = \epsilon$,
w.p.1. The difference $TT^{-}\epsilon - \epsilon$ can be written as
$TT^{-}\epsilon - \epsilon = (TT^{-} - I_N)\epsilon$; therefore we must show (see
explanation below) that

\[
E[\epsilon'(TT^{-} - I_N)'(TT^{-} - I_N)\epsilon] = 0. \qquad \text{(A8.1)}
\]

The expectation in Eq. (A8.1) can be written as

\[
\begin{aligned}
E[\epsilon'(TT^{-} - I_N)'(TT^{-} - I_N)\epsilon]
&= \mathrm{trace}[(TT^{-} - I_N)'(TT^{-} - I_N)\sigma^2G] \\
&= \sigma^2\,\mathrm{trace}[(TT^{-} - I_N)'(TT^{-} - I_N)VV'] \\
&= 0
\end{aligned}
\]

since $TT^{-}V = V$, by proof of part 1(i) of Lemma 4.1.
Therefore $Y \in C(T)$, w.p.1.

We now show that proving the equality in (A8.1) is
equivalent to proving that $TT^{-}\epsilon = \epsilon$, w.p.1. By the Markov
Inequality

\[
P\{|u_i - v_i| \ge \epsilon\} \le [E(u_i - v_i)^2]/\epsilon^2
\]

and therefore if $E(u_i - v_i)^2 = 0$, we have $u_i = v_i$, w.p.1.
If $u' = (u_1, u_2, \ldots, u_N)$, $v' = (v_1, v_2, \ldots, v_N)$, and if
$E(u_i - v_i)^2 = 0$, for i = 1, 2, ..., N, then $u_i = v_i$, w.p.1,
for i = 1, 2, ..., N, which implies that u = v, w.p.1. But
$E(u_i - v_i)^2 = 0$, for i = 1, 2, ..., N if and only if
$\sum_{i=1}^{N} E(u_i - v_i)^2 = 0$, and since

\[
\sum_{i=1}^{N} E(u_i - v_i)^2 = E(u - v)'(u - v)
\]

we have u = v, w.p.1, if $E(u - v)'(u - v) = 0$. In (A8.1) we
take $u = TT^{-}\epsilon$ and $v = \epsilon$.
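
Parts 1 and 2 of Lemma 4.1, together with the rank identity rank(T) = rank(G:X) that is used later with this lemma, can be illustrated numerically. In the sketch below the Moore-Penrose inverse is used as one particular choice of generalized inverse; the dimensions, the seed, the tolerances, and the function name are assumptions made for the example.

```python
import numpy as np


def lemma_4_1_checks(seed=2):
    """Check parts of Lemma 4.1 on a random singular G = VV'."""
    rng = np.random.default_rng(seed)
    N = 6
    V = rng.normal(size=(N, 2))
    G = V @ V.T                          # singular: rank 2 < N
    X = rng.normal(size=(N, 2))
    T = G + X @ X.T
    # Moore-Penrose inverse as one choice of generalized inverse of T.
    T_g = np.linalg.pinv(T, rcond=1e-10)

    def rank(matrix):
        return np.linalg.matrix_rank(matrix, tol=1e-8)

    return {
        "TT^-X = X": bool(np.allclose(T @ T_g @ X, X)),
        "rank(X'T^-X) = rank(X)": bool(rank(X.T @ T_g @ X) == rank(X)),
        "rank(T) = rank(G:X)": bool(rank(T) == rank(np.hstack([G, X]))),
    }
```

Each entry of the returned dictionary should be true whenever G is positive semi-definite, whether or not it is singular.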


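APPENDIX 9
PROOF OF THEOREM 4.1(i)

In this appendix we give the proof of Theorem 4.1(i).
We show that $E(\hat{\sigma}^2) = \sigma^2$, where $\hat{\sigma}^2 = f^{-1}(Y - X\hat{\beta})'T^{-}(Y - X\hat{\beta})$.
First we write $\hat{\sigma}^2$ as

\[
\begin{aligned}
\hat{\sigma}^2 &= f^{-1}(Y - X\hat{\beta})'T^{-}(Y - X\hat{\beta}) \\
&= f^{-1}[Y'T^{-}Y - 2\hat{\beta}'X'T^{-}Y + \hat{\beta}'X'T^{-}X\hat{\beta}],
\end{aligned}
\]

where $\hat{\beta} = (X'T^{-}X)^{-}X'T^{-}Y$. Now,

\[
\begin{aligned}
\hat{\beta}'X'T^{-}X\hat{\beta} &= \hat{\beta}'X'T^{-}X(X'T^{-}X)^{-}X'T^{-}Y \\
&= \hat{\beta}'X'T^{-}Y,
\end{aligned}
\]

by Lemma 4.1, part 3(ii). Therefore

\[
\begin{aligned}
\hat{\sigma}^2 &= f^{-1}[Y'T^{-}Y - \hat{\beta}'X'T^{-}Y] \\
&= f^{-1}[Y'T^{-}Y - \{(X'T^{-}X)^{-}X'T^{-}Y\}'X'T^{-}Y] \\
&= f^{-1}[Y'T^{-}Y - Y'(T^{-})'X(X'T^{-}X)^{-}X'T^{-}Y] \\
&= f^{-1}Y'A_0Y \qquad \text{(A9.1)}
\end{aligned}
\]

where $A_0 = T^{-} - (T^{-})'X(X'T^{-}X)^{-}X'T^{-}$. Using equation (A9.1),
and applying Theorem 1(i) (Searle, 1971, p. 55), we can write
the expected value of $\hat{\sigma}^2$ as

\[
\begin{aligned}
E(\hat{\sigma}^2) &= f^{-1}E[Y'A_0Y] \\
&= f^{-1}[\mathrm{trace}\{A_0\sigma^2G\} + E(Y)'A_0E(Y)] \\
&= f^{-1}\,\mathrm{trace}[A_0\sigma^2G], \qquad \text{(A9.2)}
\end{aligned}
\]

since

\[
\begin{aligned}
E(Y)'A_0E(Y) &= \beta'X'[T^{-} - (T^{-})'X(X'T^{-}X)^{-}X'T^{-}]X\beta \\
&= \beta'X'T^{-}X\beta - \beta'(X'T^{-}X)(X'T^{-}X)^{-}(X'T^{-}X)\beta \\
&= 0
\end{aligned}
\]

as $X'(T^{-})'X = X'T^{-}X$, because T is symmetric and
$X'T^{-}X$ is unique (see proof of Theorem 4.1(ii)). Thus

\[
\begin{aligned}
E(\hat{\sigma}^2) &= f^{-1}\,\mathrm{trace}[A_0\sigma^2G] \\
&= \sigma^2f^{-1}\,\mathrm{trace}[A_0G] \\
&= \sigma^2f^{-1}\,\mathrm{trace}[A_0(T - XX')].
\end{aligned}
\]

By writing $A_0$ as in Eq. (A9.1), we get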
\[
\begin{aligned}
E(\hat{\sigma}^2) &= \sigma^2f^{-1}\,\mathrm{trace}[\{T^{-} - (T^{-})'X(X'T^{-}X)^{-}X'T^{-}\}\{T - XX'\}] \\
&= \sigma^2f^{-1}\,\mathrm{trace}[T^{-}T - T^{-}XX' - (T^{-})'X(X'T^{-}X)^{-}X'T^{-}T \\
&\qquad\qquad + (T^{-})'X(X'T^{-}X)^{-}X'T^{-}XX'] \\
&= \sigma^2f^{-1}[\mathrm{trace}\,T^{-}T - \mathrm{trace}\,T^{-}XX' \\
&\qquad\qquad - \mathrm{trace}\,(T^{-})'X(X'T^{-}X)^{-}X' + \mathrm{trace}\,(T^{-})'XX'],
\end{aligned}
\]

by Lemma 4.1, parts 1 and 3, and so

\[
E(\hat{\sigma}^2) = \sigma^2f^{-1}[\mathrm{trace}\,T^{-}T - \mathrm{trace}\,(X'T^{-}X)^{-}(X'T^{-}X)],
\]

since $X'(T^{-})'X = X'T^{-}X$. Since $T^{-}T$ and $(X'T^{-}X)^{-}(X'T^{-}X)$ are
idempotent, and $\mathrm{rank}(AA^{-}) = \mathrm{rank}(A)$ for any matrix A, we see
that

\[
E(\hat{\sigma}^2) = \sigma^2f^{-1}[\mathrm{rank}(T) - \mathrm{rank}(X'T^{-}X)]
\]

and by Lemma 4.1 part 2 we have

\[
E(\hat{\sigma}^2) = \sigma^2f^{-1}[\mathrm{rank}(T) - \mathrm{rank}(X)].
\]

Finally, since f = rank(G:X) - rank(X), we can write

\[
E(\hat{\sigma}^2) = \sigma^2f^{-1}[\mathrm{rank}(G:X) - \mathrm{rank}(X)] = \sigma^2. \qquad \text{(A9.3)}
\]

The proof of Theorem 4.1(i) is now completed by
justifying the equality in (A9.3) by showing that rank(T) =
rank(G:X). First we write

\[
\mathrm{rank}(T) = \mathrm{rank}(G + XX').
\]

Replacing G by VV', we have

\[
\begin{aligned}
\mathrm{rank}(T) &= \mathrm{rank}(VV' + XX') \\
&= \mathrm{rank}(CC'), \quad \text{where } C = (V:X) \\
&= \mathrm{rank}(C) \\
&= \mathrm{rank}(V:X) \\
&= \mathrm{rank}(G:X),
\end{aligned}
\]

since the column space of G is the same as the column space
of V. The column space of G is the same as the column space
of V if the columns of V belong to the column space of G,
and vice versa, if the columns of G belong to the column
space of V. Symbolically, this is written as
$V \subset C(G)$, and $G \subset C(V)$. To show that $V \subset C(G)$, it is
sufficient to show that $GG^{-}V = V$, but this is true because
$GG^{-}V = (VV')(VV')^{-}V = V$. Now, $G \subset C(V)$ since by
definition VV' = G.


APPENDIX 10
PROOF OF THEOREM 4.1(ii)

In Appendix 10 we prove part (ii) of Theorem 4.1, thus
we show that $\hat{\sigma}^2 = f^{-1}(Y - X\hat{\beta})'T^{-}(Y - X\hat{\beta})$ is unique with
probability one. The following theorem will be useful in
our proof.


Theorem vi(c) (Rao, 1973, p. 26).


Let B and D be non-null matrices. Then $BA^{-}D$ is
invariant for any choice of $A^{-}$ if and only if
$C(B') \subset C(A')$ and $C(D) \subset C(A)$, where $C(\cdot)$ denotes column
space.

The relationship $C(B') \subset C(A')$ holds if and only if $BA^{-}A = B$,
and similarly $C(D) \subset C(A)$ holds if and only if $AA^{-}D = D$ (see
Pringle and Rayner, 1971, p. 9). Since the quantity $\hat{\sigma}^2$ is
written as

\[
\hat{\sigma}^2 = f^{-1}[Y'T^{-}Y - Y'(T^{-})'X(X'T^{-}X)^{-}X'T^{-}Y],
\]

to show that $\hat{\sigma}^2$ is unique with probability one, it suffices
to show that

\[
Y'T^{-}Y - Y'(T^{-})'X(X'T^{-}X)^{-}X'T^{-}Y \qquad \text{(A10.1)}
\]
is invariant with probability one to the choice of the
generalized inverses involved.

First we show that $Y'T^{-}Y$ is unique with probability one
(w.p.1). From part 4 of Lemma 4.1, $Y \in C(T)$, w.p.1.
Therefore $Y' \in C(T')$, w.p.1, since T is symmetric, and then
by Theorem vi(c) (Rao, 1973, p. 26), $Y'T^{-}Y$ is unique, w.p.1.

Secondly we show that $Y'(T^{-})'X(X'T^{-}X)^{-}X'T^{-}Y$ is unique
with probability one in the following four part proof.

(1) Show $X'T^{-}X$ is unique.

From part 1(i) of Lemma 4.1, $TT^{-}X = X$ and thus $X \subset C(T)$.
Since T is symmetric, we have $X \subset C(T')$. By Theorem vi(c)
(Rao, 1973, p. 26), $X'T^{-}X$ is unique.

(2) Show $X'T^{-}Y$ is unique, w.p.1.

By (1) above, $X \subset C(T')$ and by part 4 of Lemma 4.1, $Y \in C(T)$,
w.p.1. Thus applying Theorem vi(c) (Rao, 1973, p. 26),
$X'T^{-}Y$ is unique, w.p.1.

(3) Show $Y'(T^{-})'X$ is unique, w.p.1.

This follows from part (2), since $Y'(T^{-})'X$ is equal to the
transpose of $X'T^{-}Y$, which was shown in (2) to be unique,
w.p.1.

(4) Using (1), (2), and (3) above, the second quantity
in (A10.1) is unique, w.p.1, by Theorem vi(c) (Rao, 1973,
p. 26) if

(a) $[Y'(T^{-})'X]' \in C[(X'T^{-}X)']$, w.p.1, and

(b) $X'T^{-}Y \in C(X'T^{-}X)$, w.p.1.

Part (a) is true not only with probability one but always
because $Y'(T^{-})'X(X'T^{-}X)^{-}(X'T^{-}X) = Y'(T^{-})'X$, since by part
3(i) of Lemma 4.1, $X = X(X'T^{-}X)^{-}(X'T^{-}X)$. Part (b) is true
not only w.p.1 but always because
$(X'T^{-}X)(X'T^{-}X)^{-}X'T^{-}Y = X'T^{-}Y$, by part 3(ii) of Lemma 4.1.
Therefore we have shown that both $Y'T^{-}Y$ and
$Y'(T^{-})'X(X'T^{-}X)^{-}X'T^{-}Y$ are unique with probability one, which
allows us to conclude that $\hat{\sigma}^2$ is unique with probability
one.


APPENDIX 11
PROOF OF THEOREM 4.1(iii)

In this appendix we prove Theorem 4.1(iii), that is we
show that if Y possesses an N-variate normal distribution
then $f\hat{\sigma}^2/\sigma^2 \sim \chi^2_f$, where f = rank(G:X) - rank(X). Recall
that $\hat{\sigma}^2 = f^{-1}Y'A_0Y$ where $A_0 = T^{-} - (T^{-})'X(X'T^{-}X)^{-}X'T^{-}$.
Since we have shown in Theorem 4.1(ii) that $\hat{\sigma}^2$ is unique
with probability one, the choice of the generalized inverses
in the expression for $\hat{\sigma}^2$ may be made arbitrarily. Thus we
choose each of the generalized inverses to be the unique
Moore-Penrose inverse, and we denote the unique Moore-Penrose
inverse of a matrix B by $B^{+}$. The Moore-Penrose
inverse has the following four properties (see Searle, 1971,
p. 16):

1. $BB^{+}B = B$

2. $B^{+}BB^{+} = B^{+}$

3. $(BB^{+})' = BB^{+}$

4. $(B^{+}B)' = B^{+}B$.


The quantity $f\hat{\sigma}^2/\sigma^2$ can be expressed as

\[
f\hat{\sigma}^2/\sigma^2 = Y'AY, \qquad \text{(A11.1)}
\]

where $A = (1/\sigma^2)[T^{+} - T^{+}X(X'T^{+}X)^{+}X'T^{+}]$. We wish to show
that $Y'AY \sim \chi^2_f$, which can be done by making use of the
following corollary.


Corollary 2s.1 (Searle, 1971, p. 69).

When x is $N(\mu, V)$ whether V be singular or non-singular,
$x'Ax \sim \chi^{2\prime}$ with degrees of freedom equal to trace(AV)
and noncentrality parameter equal to $(1/2)\mu'A\mu$, where
$\chi^{2\prime}$ denotes a noncentral chi-square random variable,
if and only if

(i) $VAVAV = VAV$

(ii) $\mu'AV = \mu'AVAV$, and

(iii) $\mu'A\mu = \mu'AVA\mu$.

In our application, the matrices A and V in Corollary 2s.1
(Searle, 1971, p. 69) are defined as

\[
A = (1/\sigma^2)[I - T^{+}X(X'T^{+}X)^{+}X']T^{+},
\]

and

\[
V = \sigma^2G.
\]

The proof of Theorem 4.1(iii) follows from Corollary 2s.1
(Searle, 1971, p. 69) if we can show that AVA = A. To show
that AVA = A, we first show that AG = AT, where as we
recall, AG = A(T - XX'). Thus AG = AT if AXX' = 0. Using
the complete expression for A, we have
\[
\begin{aligned}
AXX' &= (1/\sigma^2)[T^{+} - T^{+}X(X'T^{+}X)^{+}X'T^{+}]XX' \\
&= (1/\sigma^2)[T^{+}XX' - T^{+}X(X'T^{+}X)^{+}(X'T^{+}X)X'],
\end{aligned}
\]

and so by Lemma 4.1 part 3(i),

\[
\begin{aligned}
AXX' &= (1/\sigma^2)[T^{+}XX' - T^{+}XX'] \\
&= 0.
\end{aligned}
\]

Therefore, since AG = AT, we have $AVA = \sigma^2AGA = \sigma^2ATA$. We
now show that $\sigma^2ATA = A$:

\[
\begin{aligned}
\sigma^2ATA &= (1/\sigma^2)[I - T^{+}X(X'T^{+}X)^{+}X']T^{+}T[I - T^{+}X(X'T^{+}X)^{+}X']T^{+} \\
&= (1/\sigma^2)[T^{+}T - T^{+}X(X'T^{+}X)^{+}X'T^{+}T][T^{+} - T^{+}X(X'T^{+}X)^{+}X'T^{+}] \\
&= (1/\sigma^2)[T^{+}TT^{+} - T^{+}X(X'T^{+}X)^{+}X'T^{+}TT^{+} \\
&\qquad - T^{+}TT^{+}X(X'T^{+}X)^{+}X'T^{+} \\
&\qquad + T^{+}X(X'T^{+}X)^{+}X'T^{+}TT^{+}X(X'T^{+}X)^{+}X'T^{+}] \\
&= (1/\sigma^2)[T^{+} - T^{+}X(X'T^{+}X)^{+}X'T^{+} - T^{+}X(X'T^{+}X)^{+}X'T^{+} \\
&\qquad + T^{+}X(X'T^{+}X)^{+}X'T^{+}X(X'T^{+}X)^{+}X'T^{+}]
\end{aligned}
\]
since $T^{+}TT^{+} = T^{+}$, by property 2 of the Moore-Penrose
inverse. Therefore,

\[
\begin{aligned}
\sigma^2ATA &= (1/\sigma^2)[T^{+} - 2T^{+}X(X'T^{+}X)^{+}X'T^{+} + T^{+}X(X'T^{+}X)^{+}X'T^{+}] \\
&= (1/\sigma^2)[I - T^{+}X(X'T^{+}X)^{+}X']T^{+} \\
&= A.
\end{aligned}
\]

Since we have verified that $AVA = \sigma^2ATA = A$ we can conclude
that $f\hat{\sigma}^2/\sigma^2 = Y'AY \sim \chi^{2\prime}_f$, by Corollary 2s.1 (Searle, 1971,
p. 69). The quantity $f\hat{\sigma}^2/\sigma^2 \sim \chi^2_f$ and not $\chi^{2\prime}_f$, since the
noncentrality parameter equals zero, which we now show.

The noncentrality parameter, from Corollary 2s.1
(Searle, 1971, p. 69) is of the form $(1/2)\mu'A\mu$, where in our
application, $\mu = X\beta$. Thus,

\[
\begin{aligned}
\mu'A\mu &= \beta'X'AX\beta \\
&= (1/\sigma^2)\beta'X'[T^{+} - T^{+}X(X'T^{+}X)^{+}X'T^{+}]X\beta \\
&= (1/\sigma^2)[\beta'X'T^{+}X\beta - \beta'X'T^{+}X(X'T^{+}X)^{+}(X'T^{+}X)\beta]
\end{aligned}
\]

and so by Lemma 4.1 part 3(i),

\[
\begin{aligned}
\mu'A\mu &= (1/\sigma^2)[\beta'X'T^{+}X\beta - \beta'X'T^{+}X\beta] \\
&= 0.
\end{aligned}
\]
We now verify that the degrees of freedom are
f = rank(G:X) - rank(X). From Corollary 2s.1 (Searle, 1971,
p. 69) the degrees of freedom associated with Y'AY are equal
to $f = \mathrm{trace}(\sigma^2AG)$, and so

\[
\begin{aligned}
\mathrm{trace}(\sigma^2AG) &= \mathrm{trace}[I - T^{+}X(X'T^{+}X)^{+}X']T^{+}G \\
&= \mathrm{trace}(T^{+}G) - \mathrm{trace}[T^{+}X(X'T^{+}X)^{+}X'T^{+}G] \\
&= \mathrm{trace}(T^{+}T - T^{+}XX') - \mathrm{trace}[T^{+}X(X'T^{+}X)^{+}X'T^{+}T] \\
&\qquad + \mathrm{trace}[T^{+}X(X'T^{+}X)^{+}X'T^{+}XX'],
\end{aligned}
\]

since G = T - XX'. It follows that

\[
\begin{aligned}
\mathrm{trace}(\sigma^2AG) &= \mathrm{trace}\,T^{+}T - \mathrm{trace}\,T^{+}XX' - \mathrm{trace}\,X(X'T^{+}X)^{+}X'T^{+} \\
&\qquad + \mathrm{trace}\,T^{+}XX',
\end{aligned}
\]

since trace AB = trace BA for arbitrary conformable matrices A and B,
$T^{+}TT^{+} = T^{+}$, and $X(X'T^{+}X)^{+}(X'T^{+}X) = X$ by Lemma 4.1 part
3(i). Therefore

\[
\begin{aligned}
\mathrm{trace}(\sigma^2AG) &= \mathrm{trace}\,T^{+}T - \mathrm{trace}\,(X'T^{+}X)(X'T^{+}X)^{+} \\
&= \mathrm{rank}(T) - \mathrm{rank}(X'T^{+}X),
\end{aligned}
\]

since $TT^{+}$ and $(X'T^{+}X)(X'T^{+}X)^{+}$ are idempotent, and
$\mathrm{rank}(AA^{+}) = \mathrm{rank}\,A$, for any matrix A. Finally, by Lemma 4.1
part 2 we have

\[
\mathrm{trace}(\sigma^2AG) = \mathrm{rank}(T) - \mathrm{rank}(X)
\]

and by the argument in the proof of Theorem 4.1(i),

\[
\mathrm{trace}(\sigma^2AG) = \mathrm{rank}(G:X) - \mathrm{rank}(X).
\]
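
The two facts established in this appendix, AVA = A and trace(AV) = rank(G:X) - rank(X), can be illustrated numerically. The sketch below uses randomly generated matrices; the dimensions, the value of $\sigma^2$, the seed, the tolerances, and the function name are arbitrary assumptions made for the example.

```python
import numpy as np


def chi_square_conditions(seed=3, sigma2=2.0):
    """Check AVA = A and trace(AV) = rank(G:X) - rank(X) for V = sigma2 * G."""
    rng = np.random.default_rng(seed)
    N = 6
    V_half = rng.normal(size=(N, 2))
    G = V_half @ V_half.T                # singular, positive semi-definite
    X = rng.normal(size=(N, 2))
    T = G + X @ X.T

    def mp_inverse(matrix):
        return np.linalg.pinv(matrix, rcond=1e-10)

    def rank(matrix):
        return np.linalg.matrix_rank(matrix, tol=1e-8)

    T_plus = mp_inverse(T)
    A = (T_plus - T_plus @ X @ mp_inverse(X.T @ T_plus @ X) @ X.T @ T_plus) / sigma2
    V = sigma2 * G
    f = int(rank(np.hstack([G, X])) - rank(X))
    return bool(np.allclose(A @ V @ A, A)), float(np.trace(A @ V)), f
```

For these dimensions rank(G:X) = 4 and rank(X) = 2 with probability one, so the trace should equal f = 2.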


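APPENDIX 12
PROOF OF THEOREM 4.2

In this appendix we prove Theorem 4.2, thus we show
that when $Y \sim N_N(X\beta + X_2\beta_2, \sigma^2G)$ then $f\hat{\sigma}^2/\sigma^2 \sim \chi^{2\prime}_{f,\lambda}$,
where $\lambda = (1/2\sigma^2)\beta_2'X_2'[T^{-} - T^{-}X(X'T^{-}X)^{-}X'T^{-}]X_2\beta_2$.

From the proof of Theorem 4.1, we have $f\hat{\sigma}^2/\sigma^2 \sim \chi^{2\prime}_{f,\lambda}$.
By Corollary 2s.1 (Searle, 1971, p. 69) the noncentrality
parameter is

\[
\lambda = (1/2)(X\beta + X_2\beta_2)'A(X\beta + X_2\beta_2),
\]

where $A = (1/\sigma^2)[T^{-} - T^{-}X(X'T^{-}X)^{-}X'T^{-}]$. Thus

\[
\lambda = (1/2)[\beta'X'AX\beta + \beta'X'AX_2\beta_2 + \beta_2'X_2'AX\beta + \beta_2'X_2'AX_2\beta_2].
\]

From the proof of Theorem 4.1(iii), $\beta'X'AX\beta = 0$. We now
show that $\beta_2'X_2'AX\beta = 0$:

\[
\begin{aligned}
\beta_2'X_2'AX\beta &= \beta_2'X_2'[T^{-} - T^{-}X(X'T^{-}X)^{-}X'T^{-}]X\beta/\sigma^2 \\
&= \beta_2'X_2'[T^{-}X\beta - T^{-}X(X'T^{-}X)^{-}(X'T^{-}X)\beta]/\sigma^2,
\end{aligned}
\]

and so by Lemma 4.1 part 3(i),

\[
\begin{aligned}
\beta_2'X_2'AX\beta &= \beta_2'X_2'[T^{-}X\beta - T^{-}X\beta]/\sigma^2 \\
&= 0.
\end{aligned}
\]

Thus we conclude that

\[
\begin{aligned}
\lambda &= (1/2)\beta_2'X_2'AX_2\beta_2 \\
&= (1/2\sigma^2)\beta_2'X_2'[T^{-} - T^{-}X(X'T^{-}X)^{-}X'T^{-}]X_2\beta_2.
\end{aligned}
\]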


APPENDIX 13

PROOF OF THE EQUALITY $SSE_A = d'V_0^{-1}d + SSE$

In this appendix we show that the check point method
for testing a fitted model for lack of fit and the method in
which the residual sum of squares is partitioned into a lack
of fit sum of squares and a pure error sum of squares are
equivalent in the sense that $SSE_A = d'V_0^{-1}d + SSE$.

Let us define $Y_A$ and $X_A$ as

\[
Y_A = \begin{bmatrix} Y \\ Y^{*} \end{bmatrix} \qquad \text{(A13.1)}
\]

and

\[
X_A = \begin{bmatrix} X \\ X^{*} \end{bmatrix}. \qquad \text{(A13.2)}
\]

Then the residual sum of squares from regressing $Y_A$ on $X_A$ is

\[
SSE_A = Y_A'[I - X_A(X_A'X_A)^{-1}X_A']Y_A.
\]
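Using Eqs. (A13.1) and (A13.2) we can write $SSE_A$ as

\[
\begin{aligned}
SSE_A &= \begin{bmatrix} Y \\ Y^{*} \end{bmatrix}'
\left\{ I - \begin{bmatrix} X \\ X^{*} \end{bmatrix}(X'X + X^{*\prime}X^{*})^{-1}\begin{bmatrix} X' & X^{*\prime} \end{bmatrix} \right\}
\begin{bmatrix} Y \\ Y^{*} \end{bmatrix} \\
&= Y^{*\prime}[I - X^{*}(X'X + X^{*\prime}X^{*})^{-1}X^{*\prime}]Y^{*} \\
&\qquad - 2\,Y^{*\prime}X^{*}(X'X + X^{*\prime}X^{*})^{-1}X'Y \\
&\qquad + Y'[I - X(X'X + X^{*\prime}X^{*})^{-1}X']Y \\
&= Y^{*\prime}V_0^{-1}Y^{*} - 2\,Y^{*\prime}X^{*}(X'X + X^{*\prime}X^{*})^{-1}X'Y \\
&\qquad + Y'[I - X(X'X + X^{*\prime}X^{*})^{-1}X']Y. \qquad \text{(A13.3)}
\end{aligned}
\]

Eq. (A13.3) is true because from Eq. (8) (Morrison, 1976, p.
69) we can write $V_0^{-1}$ as

\[
\begin{aligned}
V_0^{-1} &= [I + X^{*}(X'X)^{-1}X^{*\prime}]^{-1} \\
&= I - X^{*}(X'X + X^{*\prime}X^{*})^{-1}X^{*\prime}.
\end{aligned}
\]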
We now write the quadratic form $d'V_0^{-1}d$ as

\[
\begin{aligned}
d'V_0^{-1}d &= (Y^{*} - \hat{Y}^{*})'V_0^{-1}(Y^{*} - \hat{Y}^{*}) \\
&= Y^{*\prime}V_0^{-1}Y^{*} - 2\,Y^{*\prime}V_0^{-1}\hat{Y}^{*} + \hat{Y}^{*\prime}V_0^{-1}\hat{Y}^{*} \\
&= Y^{*\prime}V_0^{-1}Y^{*} - 2\,Y^{*\prime}V_0^{-1}X^{*}(X'X)^{-1}X'Y \\
&\qquad + Y'X(X'X)^{-1}X^{*\prime}V_0^{-1}X^{*}(X'X)^{-1}X'Y. \qquad \text{(A13.4)}
\end{aligned}
\]

The first portion in Eq. (A13.3) is equal to the first
portion in Eq. (A13.4). We now show that the second
portions of Eqs. (A13.3) and (A13.4) are equal. It can be
verified using Eq. (8) (Morrison, 1976, p. 69) that

\[
(X'X + X^{*\prime}X^{*})^{-1} = (X'X)^{-1} - (X'X)^{-1}X^{*\prime}V_0^{-1}X^{*}(X'X)^{-1}. \qquad \text{(A13.5)}
\]

Using Eq. (A13.5) the second portion of $SSE_A$ in Eq. (A13.3)
can be written as

\[
\begin{aligned}
-2\,Y^{*\prime}X^{*}(X'X + X^{*\prime}X^{*})^{-1}X'Y
&= -2\,Y^{*\prime}X^{*}[(X'X)^{-1} - (X'X)^{-1}X^{*\prime}V_0^{-1}X^{*}(X'X)^{-1}]X'Y \\
&= -2\,Y^{*\prime}[I - X^{*}(X'X)^{-1}X^{*\prime}V_0^{-1}]X^{*}(X'X)^{-1}X'Y. \qquad \text{(A13.6)}
\end{aligned}
\]
The second portion of $SSE_A$, given in Eq. (A13.6), is seen to
equal the second portion of $d'V_0^{-1}d$ in Eq. (A13.4) using the
fact that

\[
\begin{aligned}
I - X^{*}(X'X)^{-1}X^{*\prime}V_0^{-1} &= I - (V_0 - I)V_0^{-1} \\
&= V_0^{-1}.
\end{aligned}
\]

We now show that the third portion of the expression
for $SSE_A$ in Eq. (A13.3) is equal to the sum of the third
portion of $d'V_0^{-1}d$ in Eq. (A13.4) and SSE, where
$SSE = Y'[I - X(X'X)^{-1}X']Y$. Using the result in Eq. (A13.5),
the third portion of $SSE_A$ in Eq. (A13.3) can be written as

\[
\begin{aligned}
Y'[I - X(X'X + X^{*\prime}X^{*})^{-1}X']Y
&= Y'[I - X\{(X'X)^{-1} - (X'X)^{-1}X^{*\prime}V_0^{-1}X^{*}(X'X)^{-1}\}X']Y \\
&= Y'[I - X(X'X)^{-1}X']Y + Y'X(X'X)^{-1}X^{*\prime}V_0^{-1}X^{*}(X'X)^{-1}X'Y.
\end{aligned}
\]

Therefore, since the first two portions of $SSE_A$ in Eq.
(A13.3) are equal to the first two portions of $d'V_0^{-1}d$ in Eq.
(A13.4), respectively, and the third portion of $SSE_A$ in Eq.
(A13.3) is equal to the sum of the third portion of $d'V_0^{-1}d$
in Eq. (A13.4) and $Y'[I - X(X'X)^{-1}X']Y$, we must have then

\[
\begin{aligned}
SSE_A &= d'V_0^{-1}d + Y'[I - X(X'X)^{-1}X']Y \\
&= d'V_0^{-1}d + SSE.
\end{aligned}
\]
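
The decomposition proved in this appendix can be confirmed numerically. The sketch below is an illustration on randomly generated data; the dimensions, the seed, and the function name are arbitrary assumptions, and d is taken to be $Y^{*} - X^{*}b$ with b the least squares estimate from the regression of Y on X.

```python
import numpy as np


def check_point_decomposition(seed=4, N=10, p=3, k=2):
    """Return (SSE_A, d'V0^{-1}d + SSE) for random data."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(N, p))          # design matrix used for fitting
    Xs = rng.normal(size=(k, p))         # X*, model terms at the check points
    Y = rng.normal(size=N)
    Ys = rng.normal(size=k)              # Y*, observations at the check points

    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ Y                # least squares estimate from (Y, X)
    sse = Y @ Y - Y @ X @ b              # residual sum of squares of that fit
    d = Ys - Xs @ b                      # observed minus predicted at check points
    V0 = np.eye(k) + Xs @ XtX_inv @ Xs.T
    quad = d @ np.linalg.solve(V0, d)    # d'V0^{-1}d

    # Residual sum of squares from regressing the augmented Y_A on X_A.
    XA = np.vstack([X, Xs])
    YA = np.concatenate([Y, Ys])
    bA = np.linalg.solve(XA.T @ XA, XA.T @ YA)
    sse_a = YA @ YA - YA @ XA @ bA
    return sse_a, quad + sse
```

The two returned values agree to rounding error for any full-rank X, which is the content of the equality $SSE_A = d'V_0^{-1}d + SSE$.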